decoding.scorers
Scorers are objects that take an instance of `decoding.pmf.LogPMF` and return a list of `decoding.pmf.ScoredItem` instances that are sorted by their scores. The scores are computed by a function that is passed to the constructor of the `Scorer` object.
The `Scorer` class is a frozen dataclass that wraps this scoring function. Its constructors prepare the scoring function from a variety of input types, depending on what is most convenient for the user. For example, the user can handle details like batching, weighting, and parallelization directly, or let the class coordinate them.
**NB**: The examples below illustrate the API but are not particularly meaningful or interesting. See [`TUTORIAL.md`](https://github.com/benlipkin/decoding/blob/main/TUTORIAL.md) for more practical examples of scoring functions in action.
```python
"""
Scorers are objects that take an instance of `decoding.pmf.LogPMF`
and return a list of `decoding.pmf.ScoredItem` instances that are sorted by their
scores. The scores are computed by a function that is passed to the constructor
of the `Scorer` object.

The `Scorer` class is a frozen dataclass that wraps this scoring function. The
class supports constructors that enable the preparation of this scoring function
from a variety of input types, depending on what is most convenient for the user.
For example, the user can choose whether to engage with or simply allow the class
to coordinate details like batching, weighting, and parallelization.

**NB**: The examples below are illustrative of the API, but are not particularly
meaningful or interesting. Check out the
[`TUTORIAL.md`](https://github.com/benlipkin/decoding/blob/main/TUTORIAL.md)
for more practical examples of scoring functions in action.
"""

from collections.abc import Callable, Sequence
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass

from decoding.pmf import LogPMF, ScoredItem, make_scored_items, sort_scored_items
from decoding.types import NUM


@dataclass(frozen=True, kw_only=True)
class Scorer:
    """
    The `Scorer` class wraps and coordinates user-supplied scoring functions.
    """

    _f: Callable[[LogPMF[str]], list[ScoredItem[str]]]

    def __call__(self, d: LogPMF[str]) -> list[ScoredItem[str]]:
        """
        `__call__` is an alias for `score`.
        """
        return self.score(d)

    def score(self, d: LogPMF[str]) -> list[ScoredItem[str]]:
        """
        Process a `decoding.pmf.LogPMF` instance and returns a list
        of `decoding.pmf.ScoredItem` instances that are sorted by their scores.

        Args:
            d: A `decoding.pmf.LogPMF` instance.

        Returns:
            A list of `decoding.pmf.ScoredItem` instances that are sorted
                by their scores.

        Example:
            ```python
            from decoding.pmf import LogPMF
            from decoding.scorers import Scorer

            scorer = Scorer.from_f_str_to_num(lambda x: len(x))
            d = LogPMF.from_samples(["a", "bb", "ccc"])
            samples = scorer(d)
            assert samples[0].item == "ccc"
            assert samples[0].score == 3
            ```

        """
        return sort_scored_items(self._f(d))

    @classmethod
    def from_f_str_to_num(
        cls, f: Callable[[str], NUM], *, parallelize: bool = False
    ) -> "Scorer":
        """
        Construct a `Scorer` object from a function that maps a string to a
        number. The `Scorer` object will then score a
        `decoding.pmf.LogPMF` instance by applying this function to
        each of its categories.

        Args:
            f: A function that maps a string to a number.
            parallelize: A boolean indicating whether to parallelize
                the scoring process.

        Returns:
            A `Scorer` object.

        Example:
            ```python
            from decoding.pmf import LogPMF
            from decoding.scorers import Scorer

            scorer = Scorer.from_f_str_to_num(lambda x: len(x), parallelize=True)
            d = LogPMF.from_samples(["a", "bb", "ccc"])
            samples = scorer(d)
            assert samples[-1].item == "a"
            assert samples[-1].score == 1
            ```

        """

        def _f(d: LogPMF[str]) -> list[ScoredItem[str]]:
            if parallelize:
                with ThreadPoolExecutor() as e:
                    utilities = list(e.map(f, d.items))
            else:
                utilities = list(map(f, d.items))
            return make_scored_items(d.items, utilities)

        return cls(_f=_f)

    @classmethod
    def from_f_batch_str_to_batch_num(
        cls, f: Callable[[Sequence[str]], Sequence[NUM]]
    ) -> "Scorer":
        """
        Construct a `Scorer` object from a function that maps a sequence of
        strings to a sequence of numbers. The `Scorer` object will then score
        a `decoding.pmf.LogPMF` instance by applying this function
        to its categories.

        Args:
            f: A function that maps a sequence of strings to a sequence of numbers.

        Returns:
            A `Scorer` object.

        Example:
            ```python
            from decoding.pmf import LogPMF
            from decoding.scorers import Scorer

            scorer = Scorer.from_f_batch_str_to_batch_num(lambda x: [len(s) for s in x])
            d = LogPMF.from_samples(["a", "bb", "ccc"])
            samples = scorer(d)
            assert samples[0].item == "ccc"
            assert samples[0].score == 3
            ```

        """

        def _f(d: LogPMF[str]) -> list[ScoredItem[str]]:
            utilities = f(d.items)
            return make_scored_items(d.items, utilities)

        return cls(_f=_f)

    @classmethod
    def from_f_logpmf_to_batch_num(
        cls, f: Callable[[LogPMF[str]], Sequence[NUM]]
    ) -> "Scorer":
        """
        Construct a `Scorer` object from a function that maps a
        `decoding.pmf.LogPMF` instance to a sequence of numbers.
        The `Scorer` object will then score the
        `decoding.pmf.LogPMF` instance directly.

        Args:
            f: A function that maps a `decoding.pmf.LogPMF`
                instance to a sequence of numbers.

        Returns:
            A `Scorer` object.

        Example:
            ```python
            import jax.numpy as jnp
            from decoding.pmf import LogPMF
            from decoding.scorers import Scorer

            f = lambda d: [jnp.exp(logp) * len(item) for logp, item in d]
            scorer = Scorer.from_f_logpmf_to_batch_num(f)
            d = LogPMF.from_samples(["a", "bb", "bb", "ccc"])
            samples = scorer(d)
            assert samples[0].item == "bb"
            assert samples[0].score == 1.0
            assert samples[1].item == "ccc"
            assert samples[1].score == 0.75
            assert samples[2].item == "a"
            assert samples[2].score == 0.25
            ```

        """

        def _f(d: LogPMF[str]) -> list[ScoredItem[str]]:
            utilities = f(d)
            return make_scored_items(d.items, utilities)

        return cls(_f=_f)

    @classmethod
    def from_f_str_to_item(
        cls, f: Callable[[str], ScoredItem[str]], *, parallelize: bool = False
    ) -> "Scorer":
        """
        Construct a `Scorer` object from a function that maps a string to a
        `decoding.pmf.ScoredItem` instance. The `Scorer` object will then score a
        `decoding.pmf.LogPMF` instance by applying this function to
        each of its categories. This allows us to update not only the score
        values but also the items themselves.

        Args:
            f: A function that maps a string to a `decoding.pmf.ScoredItem` instance.
            parallelize: A boolean indicating whether to parallelize
                the scoring process.

        Returns:
            A `Scorer` object.

        Example:
            ```python
            from decoding.pmf import LogPMF, ScoredItem
            from decoding.scorers import Scorer

            def f(x):
                if x.endswith("."):
                    return ScoredItem(item=x[:-1], score=len(x)-1)
                return ScoredItem(item=x, score=len(x))

            scorer = Scorer.from_f_str_to_item(f, parallelize=True)
            d = LogPMF.from_samples(["a", "bb.", "ccc"])
            samples = scorer(d)
            assert samples[0].item == "ccc"
            assert samples[0].score == 3
            assert samples[1].item == "bb"
            assert samples[1].score == 2
            ```

        """

        def _f(d: LogPMF[str]) -> list[ScoredItem[str]]:
            if parallelize:
                with ThreadPoolExecutor() as e:
                    return list(e.map(f, d.items))
            else:
                return list(map(f, d.items))

        return cls(_f=_f)

    @classmethod
    def from_f_batch_str_to_batch_item(
        cls, f: Callable[[Sequence[str]], Sequence[ScoredItem[str]]]
    ) -> "Scorer":
        """
        Construct a `Scorer` object from a function that maps a sequence of strings
        to a sequence of `decoding.pmf.ScoredItem` instances. The `Scorer` object will
        then score a `decoding.pmf.LogPMF` instance by applying this function
        to its categories. This allows us to update not only the score values but
        also the items themselves.

        Args:
            f: A function that maps a sequence of strings to a sequence of
                `decoding.pmf.ScoredItem` instances.

        Returns:
            A `Scorer` object.

        Example:
            ```python
            from decoding.pmf import LogPMF, ScoredItem
            from decoding.scorers import Scorer

            f = lambda xs: [ScoredItem(item=x[1:], score=len(x[1:])) for x in xs]
            scorer = Scorer.from_f_batch_str_to_batch_item(f)
            d = LogPMF.from_samples(["_a", "_bb", "_ccc"])
            samples = scorer(d)
            assert samples[0].item == "ccc"
            assert samples[0].score == 3
            ```

        """

        def _f(d: LogPMF[str]) -> list[ScoredItem[str]]:
            return list(f(d.items))

        return cls(_f=_f)

    @classmethod
    def from_f_logpmf_to_batch_item(
        cls, f: Callable[[LogPMF[str]], Sequence[ScoredItem[str]]]
    ) -> "Scorer":
        """
        Construct a `Scorer` object from a function that maps a
        `decoding.pmf.LogPMF` instance to a sequence of
        `decoding.pmf.ScoredItem` instances. This type signature actually
        matches much of the `decoding.estimators` module, so this constructor
        is particularly useful for building `Scorer` instances based on
        `decoding.estimators.MBR`, etc.

        Args:
            f: A function that maps a `decoding.pmf.LogPMF`
                instance to a sequence of `decoding.pmf.ScoredItem` instances.

        Returns:
            A `Scorer` object.

        Example:
            ```python
            import jax.numpy as jnp
            from decoding.estimators import MBR
            from decoding.pmf import LogPMF
            from decoding.scorers import Scorer

            f = lambda d: MBR(d, utility=lambda x1, x2: x1 < x2)
            scorer = Scorer.from_f_logpmf_to_batch_item(f)
            d = LogPMF.from_samples(["aa", "bb", "cc"])
            samples = scorer(d)
            assert samples[0].item == "aa"
            assert jnp.isclose(samples[0].score, 2/3)
            assert samples[1].item == "bb"
            assert jnp.isclose(samples[1].score, 1/3)
            ```

        """

        def _f(d: LogPMF[str]) -> list[ScoredItem[str]]:
            return list(f(d))

        return cls(_f=_f)
```
@dataclass(frozen=True, kw_only=True)
class Scorer:
The `Scorer` class wraps and coordinates user-supplied scoring functions.
def score(self, d: LogPMF[str]) -> list[ScoredItem[str]]:
Processes a `decoding.pmf.LogPMF` instance and returns a list of `decoding.pmf.ScoredItem` instances sorted by their scores (highest first, as the example shows).
Arguments:
- d: A `decoding.pmf.LogPMF` instance.
Returns:
A list of `decoding.pmf.ScoredItem` instances that are sorted by their scores.
Example:
```python
from decoding.pmf import LogPMF
from decoding.scorers import Scorer

scorer = Scorer.from_f_str_to_num(lambda x: len(x))
d = LogPMF.from_samples(["a", "bb", "ccc"])
samples = scorer(d)
assert samples[0].item == "ccc"
assert samples[0].score == 3
```
@classmethod
def from_f_str_to_num(cls, f: Callable[[str], NUM], *, parallelize: bool = False) -> "Scorer":
Construct a `Scorer` object from a function that maps a string to a number. The `Scorer` object will then score a `decoding.pmf.LogPMF` instance by applying this function to each of its categories.
Arguments:
- f: A function that maps a string to a number.
- parallelize: A boolean indicating whether to apply `f` to the categories concurrently, using a `ThreadPoolExecutor`, rather than sequentially.
Returns:
A `Scorer` object.
Example:
```python
from decoding.pmf import LogPMF
from decoding.scorers import Scorer

scorer = Scorer.from_f_str_to_num(lambda x: len(x), parallelize=True)
d = LogPMF.from_samples(["a", "bb", "ccc"])
samples = scorer(d)
assert samples[-1].item == "a"
assert samples[-1].score == 1
```
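In the source, the `parallelize` flag switches between the built-in `map` and `ThreadPoolExecutor.map`. The following standard-library sketch, built around a hypothetical `slow_len` function, shows why the two are interchangeable for this purpose: `Executor.map` yields results in input order, so each score still lines up with its item. That threads mainly pay off when the scoring function waits on I/O is an assumption about typical use, not something the source states.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def slow_len(x: str) -> int:
    # Hypothetical scoring function that waits briefly, e.g. on a remote call.
    time.sleep(0.01 * (4 - len(x)))  # shorter strings finish later
    return len(x)


items = ["a", "bb", "ccc"]

sequential = list(map(slow_len, items))
with ThreadPoolExecutor() as e:
    # Results come back in input order, regardless of completion order.
    parallel = list(e.map(slow_len, items))
```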
@classmethod
def from_f_batch_str_to_batch_num(cls, f: Callable[[Sequence[str]], Sequence[NUM]]) -> "Scorer":
Construct a `Scorer` object from a function that maps a sequence of strings to a sequence of numbers. The `Scorer` object will then score a `decoding.pmf.LogPMF` instance by applying this function to its categories.
Arguments:
- f: A function that maps a sequence of strings to a sequence of numbers.
Returns:
A `Scorer` object.
Example:
```python
from decoding.pmf import LogPMF
from decoding.scorers import Scorer

scorer = Scorer.from_f_batch_str_to_batch_num(lambda x: [len(s) for s in x])
d = LogPMF.from_samples(["a", "bb", "ccc"])
samples = scorer(d)
assert samples[0].item == "ccc"
assert samples[0].score == 3
```
@classmethod
def from_f_logpmf_to_batch_num(cls, f: Callable[[LogPMF[str]], Sequence[NUM]]) -> "Scorer":
Construct a `Scorer` object from a function that maps a `decoding.pmf.LogPMF` instance to a sequence of numbers. The `Scorer` object will then score the `decoding.pmf.LogPMF` instance directly.
Arguments:
- f: A function that maps a `decoding.pmf.LogPMF` instance to a sequence of numbers.
Returns:
A `Scorer` object.
Example:
```python
import jax.numpy as jnp
from decoding.pmf import LogPMF
from decoding.scorers import Scorer

f = lambda d: [jnp.exp(logp) * len(item) for logp, item in d]
scorer = Scorer.from_f_logpmf_to_batch_num(f)
d = LogPMF.from_samples(["a", "bb", "bb", "ccc"])
samples = scorer(d)
assert samples[0].item == "bb"
assert samples[0].score == 1.0
assert samples[1].item == "ccc"
assert samples[1].score == 0.75
assert samples[2].item == "a"
assert samples[2].score == 0.25
```
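The scores asserted in this example are probability-weighted lengths, and the arithmetic can be reproduced by hand. The sketch below does not call the library; it assumes that `LogPMF.from_samples` assigns each distinct string its empirical frequency, which is consistent with the asserted values but is not confirmed here.

```python
from collections import Counter

samples = ["a", "bb", "bb", "ccc"]
counts = Counter(samples)
n = len(samples)

# Empirical probability of each distinct string (assumed to mirror from_samples).
probs = {item: c / n for item, c in counts.items()}

# Probability-weighted length, the quantity computed by `f` in the example.
scores = {item: p * len(item) for item, p in probs.items()}
ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```

Here "bb" has probability 0.5 and length 2, giving 1.0, while "ccc" gives 0.25 × 3 = 0.75 and "a" gives 0.25 × 1 = 0.25.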
@classmethod
def from_f_str_to_item(cls, f: Callable[[str], ScoredItem[str]], *, parallelize: bool = False) -> "Scorer":
Construct a `Scorer` object from a function that maps a string to a `decoding.pmf.ScoredItem` instance. The `Scorer` object will then score a `decoding.pmf.LogPMF` instance by applying this function to each of its categories. This allows us to update not only the score values but also the items themselves.
Arguments:
- f: A function that maps a string to a `decoding.pmf.ScoredItem` instance.
- parallelize: A boolean indicating whether to apply `f` to the categories concurrently, using a `ThreadPoolExecutor`, rather than sequentially.
Returns:
A `Scorer` object.
Example:
```python
from decoding.pmf import LogPMF, ScoredItem
from decoding.scorers import Scorer

def f(x):
    if x.endswith("."):
        return ScoredItem(item=x[:-1], score=len(x)-1)
    return ScoredItem(item=x, score=len(x))

scorer = Scorer.from_f_str_to_item(f, parallelize=True)
d = LogPMF.from_samples(["a", "bb.", "ccc"])
samples = scorer(d)
assert samples[0].item == "ccc"
assert samples[0].score == 3
assert samples[1].item == "bb"
assert samples[1].score == 2
```
@classmethod
def from_f_batch_str_to_batch_item(cls, f: Callable[[Sequence[str]], Sequence[ScoredItem[str]]]) -> "Scorer":
Construct a `Scorer` object from a function that maps a sequence of strings to a sequence of `decoding.pmf.ScoredItem` instances. The `Scorer` object will then score a `decoding.pmf.LogPMF` instance by applying this function to its categories. This allows us to update not only the score values but also the items themselves.
Arguments:
- f: A function that maps a sequence of strings to a sequence of `decoding.pmf.ScoredItem` instances.
Returns:
A `Scorer` object.
Example:
```python
from decoding.pmf import LogPMF, ScoredItem
from decoding.scorers import Scorer

f = lambda xs: [ScoredItem(item=x[1:], score=len(x[1:])) for x in xs]
scorer = Scorer.from_f_batch_str_to_batch_item(f)
d = LogPMF.from_samples(["_a", "_bb", "_ccc"])
samples = scorer(d)
assert samples[0].item == "ccc"
assert samples[0].score == 3
```
@classmethod
def from_f_logpmf_to_batch_item(cls, f: Callable[[LogPMF[str]], Sequence[ScoredItem[str]]]) -> "Scorer":
Construct a `Scorer` object from a function that maps a `decoding.pmf.LogPMF` instance to a sequence of `decoding.pmf.ScoredItem` instances. This type signature matches much of the `decoding.estimators` module, so this constructor is particularly useful for building `Scorer` instances from estimators such as `decoding.estimators.MBR`.
Arguments:
- f: A function that maps a `decoding.pmf.LogPMF` instance to a sequence of `decoding.pmf.ScoredItem` instances.
Returns:
A `Scorer` object.
Example:
```python
import jax.numpy as jnp
from decoding.estimators import MBR
from decoding.pmf import LogPMF
from decoding.scorers import Scorer

f = lambda d: MBR(d, utility=lambda x1, x2: x1 < x2)
scorer = Scorer.from_f_logpmf_to_batch_item(f)
d = LogPMF.from_samples(["aa", "bb", "cc"])
samples = scorer(d)
assert samples[0].item == "aa"
assert jnp.isclose(samples[0].score, 2/3)
assert samples[1].item == "bb"
assert jnp.isclose(samples[1].score, 1/3)
```
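The asserted scores of 2/3 and 1/3 can be checked by hand, under the assumption that `MBR` scores each item by its expected utility against the distribution, which here is uniform over the three samples. This sketch does not call the library, and `expected_utility` is a hypothetical helper written only for the check.

```python
from fractions import Fraction

items = ["aa", "bb", "cc"]
p = Fraction(1, len(items))  # uniform empirical distribution


def utility(x1: str, x2: str) -> bool:
    # Same comparison as the lambda passed to MBR in the example.
    return x1 < x2


def expected_utility(x1: str) -> Fraction:
    # Sum of p(x2) * utility(x1, x2) over every x2, including x1 itself.
    return sum((p * int(utility(x1, x2)) for x2 in items), Fraction(0))


scores = {x: expected_utility(x) for x in items}
```

"aa" sorts before both "bb" and "cc", so it collects utility from two of the three items; "bb" collects from one, and "cc" from none.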