decoding.scorers

Scorers are objects that take an instance of decoding.pmf.LogPMF and return a list of decoding.pmf.ScoredItem instances that are sorted by their scores. The scores are computed by a function that is passed to the constructor of the Scorer object.

The Scorer class is a frozen dataclass that wraps this scoring function. The class supports constructors that enable the preparation of this scoring function from a variety of input types, depending on what is most convenient for the user. For example, the user can choose whether to engage with or simply allow the class to coordinate details like batching, weighting, and parallelization.

NB: The examples below are illustrative of the API, but are not particularly meaningful or interesting. Check out the TUTORIAL.md for more practical examples of scoring functions in action.

  1"""
  2Scorers are objects that take an instance of `decoding.pmf.LogPMF`
  3and return a list of `decoding.pmf.ScoredItem` instances that are sorted by their
  4scores. The scores are computed by a function that is passed to the constructor
  5of the `Scorer` object.
  6
  7The `Scorer` class is a frozen dataclass that wraps this scoring function. The
  8class supports constructors that enable the preparation of this scoring function
  9from a variety of input types, depending on what is most convenient for the user.
 10For example, the user can choose whether to engage with or simply allow the class
 11to coordinate details like batching, weighting, and parallelization.
 12
 13**NB**: The examples below are illustrative of the API, but are not particularly
 14meaningful or interesting. Check out the
 15[`TUTORIAL.md`](https://github.com/benlipkin/decoding/blob/main/TUTORIAL.md)
 16for more practical examples of scoring functions in action.
 17"""
 18
 19from collections.abc import Callable, Sequence
 20from concurrent.futures import ThreadPoolExecutor
 21from dataclasses import dataclass
 22
 23from decoding.pmf import LogPMF, ScoredItem, make_scored_items, sort_scored_items
 24from decoding.types import NUM
 25
 26
 27@dataclass(frozen=True, kw_only=True)
 28class Scorer:
 29    """
 30    The `Scorer` class wraps and coordinates user-supplied scoring functions.
 31    """
 32
 33    _f: Callable[[LogPMF[str]], list[ScoredItem[str]]]
 34
 35    def __call__(self, d: LogPMF[str]) -> list[ScoredItem[str]]:
 36        """
 37        `__call__` is an alias for `score`.
 38        """
 39        return self.score(d)
 40
 41    def score(self, d: LogPMF[str]) -> list[ScoredItem[str]]:
 42        """
 43        Process a `decoding.pmf.LogPMF` instance and returns a list
 44        of `decoding.pmf.ScoredItem` instances that are sorted by their scores.
 45
 46        Args:
 47            d: A `decoding.pmf.LogPMF` instance.
 48
 49        Returns:
 50            A list of `decoding.pmf.ScoredItem` instances that are sorted
 51            by their scores.
 52
 53        Example:
 54            ```python
 55            from decoding.pmf import LogPMF
 56            from decoding.scorers import Scorer
 57
 58            scorer = Scorer.from_f_str_to_num(lambda x: len(x))
 59            d = LogPMF.from_samples(["a", "bb", "ccc"])
 60            samples = scorer(d)
 61            assert samples[0].item == "ccc"
 62            assert samples[0].score == 3
 63            ```
 64
 65        """
 66        return sort_scored_items(self._f(d))
 67
 68    @classmethod
 69    def from_f_str_to_num(
 70        cls, f: Callable[[str], NUM], *, parallelize: bool = False
 71    ) -> "Scorer":
 72        """
 73        Construct a `Scorer` object from a function that maps a string to a
 74        number. The `Scorer` object will then score a
 75        `decoding.pmf.LogPMF` instance by applying this function to
 76        each of its categories.
 77
 78        Args:
 79            f: A function that maps a string to a number.
 80            parallelize: A boolean indicating whether to parallelize
 81                the scoring process.
 82
 83        Returns:
 84            A `Scorer` object.
 85
 86        Example:
 87            ```python
 88            from decoding.pmf import LogPMF
 89            from decoding.scorers import Scorer
 90
 91            scorer = Scorer.from_f_str_to_num(lambda x: len(x), parallelize=True)
 92            d = LogPMF.from_samples(["a", "bb", "ccc"])
 93            samples = scorer(d)
 94            assert samples[-1].item == "a"
 95            assert samples[-1].score == 1
 96            ```
 97
 98        """
 99
100        def _f(d: LogPMF[str]) -> list[ScoredItem[str]]:
101            if parallelize:
102                with ThreadPoolExecutor() as e:
103                    utilities = list(e.map(f, d.items))
104            else:
105                utilities = list(map(f, d.items))
106            return make_scored_items(d.items, utilities)
107
108        return cls(_f=_f)
109
110    @classmethod
111    def from_f_batch_str_to_batch_num(
112        cls, f: Callable[[Sequence[str]], Sequence[NUM]]
113    ) -> "Scorer":
114        """
115        Construct a `Scorer` object from a function that maps a sequence of
116        strings to a sequence of numbers. The `Scorer` object will then score
117        a `decoding.pmf.LogPMF` instance by applying this function
118        to its categories.
119
120        Args:
121            f: A function that maps a sequence of strings to a sequence of numbers.
122
123        Returns:
124            A `Scorer` object.
125
126        Example:
127            ```python
128            from decoding.pmf import LogPMF
129            from decoding.scorers import Scorer
130
131            scorer = Scorer.from_f_batch_str_to_batch_num(lambda x: [len(s) for s in x])
132            d = LogPMF.from_samples(["a", "bb", "ccc"])
133            samples = scorer(d)
134            assert samples[0].item == "ccc"
135            assert samples[0].score == 3
136            ```
137
138        """
139
140        def _f(d: LogPMF[str]) -> list[ScoredItem[str]]:
141            utilities = f(d.items)
142            return make_scored_items(d.items, utilities)
143
144        return cls(_f=_f)
145
146    @classmethod
147    def from_f_logpmf_to_batch_num(
148        cls, f: Callable[[LogPMF[str]], Sequence[NUM]]
149    ) -> "Scorer":
150        """
151        Construct a `Scorer` object from a function that maps a
152        `decoding.pmf.LogPMF` instance to a sequence of numbers.
153        The `Scorer` object will then score the
154        `decoding.pmf.LogPMF` instance directly.
155
156        Args:
157            f: A function that maps a `decoding.pmf.LogPMF`
158                instance to a sequence of numbers.
159
160        Returns:
161            A `Scorer` object.
162
163        Example:
164            ```python
165            import jax.numpy as jnp
166            from decoding.pmf import LogPMF
167            from decoding.scorers import Scorer
168
169            f = lambda d: [jnp.exp(logp) * len(item) for logp, item in d]
170            scorer = Scorer.from_f_logpmf_to_batch_num(f)
171            d = LogPMF.from_samples(["a", "bb", "bb", "ccc"])
172            samples = scorer(d)
173            assert samples[0].item == "bb"
174            assert samples[0].score == 1.0
175            assert samples[1].item == "ccc"
176            assert samples[1].score == 0.75
177            assert samples[2].item == "a"
178            assert samples[2].score == 0.25
179            ```
180
181        """
182
183        def _f(d: LogPMF[str]) -> list[ScoredItem[str]]:
184            utilities = f(d)
185            return make_scored_items(d.items, utilities)
186
187        return cls(_f=_f)
188
189    @classmethod
190    def from_f_str_to_item(
191        cls, f: Callable[[str], ScoredItem[str]], *, parallelize: bool = False
192    ) -> "Scorer":
193        """
194        Construct a `Scorer` object from a function that maps a string to a
195        `decoding.pmf.ScoredItem` instance. The `Scorer` object will then score a
196        `decoding.pmf.LogPMF` instance by applying this function to
197        each of its categories. This allows us to update not only the score
198        values but also the items themselves.
199
200        Args:
201            f: A function that maps a string to a `decoding.pmf.ScoredItem` instance.
202            parallelize: A boolean indicating whether to parallelize
203                the scoring process.
204
205        Returns:
206            A `Scorer` object.
207
208        Example:
209            ```python
210            from decoding.pmf import LogPMF, ScoredItem
211            from decoding.scorers import Scorer
212
213            def f(x):
214                if x.endswith("."):
215                    return ScoredItem(item=x[:-1], score=len(x)-1)
216                return ScoredItem(item=x, score=len(x))
217
218            scorer = Scorer.from_f_str_to_item(f, parallelize=True)
219            d = LogPMF.from_samples(["a", "bb.", "ccc"])
220            samples = scorer(d)
221            assert samples[0].item == "ccc"
222            assert samples[0].score == 3
223            assert samples[1].item == "bb"
224            assert samples[1].score == 2
225            ```
226
227        """
228
229        def _f(d: LogPMF[str]) -> list[ScoredItem[str]]:
230            if parallelize:
231                with ThreadPoolExecutor() as e:
232                    return list(e.map(f, d.items))
233            else:
234                return list(map(f, d.items))
235
236        return cls(_f=_f)
237
238    @classmethod
239    def from_f_batch_str_to_batch_item(
240        cls, f: Callable[[Sequence[str]], Sequence[ScoredItem[str]]]
241    ) -> "Scorer":
242        """
243        Construct a `Scorer` object from a function that maps a sequence of strings
244        to a sequence of `decoding.pmf.ScoredItem` instances. The `Scorer` object will
245        then score a `decoding.pmf.LogPMF` instance by applying this function
246        to its categories. This allows us to update not only the score values but
247        also the items themselves.
248
249        Args:
250            f: A function that maps a sequence of strings to a sequence of
251                `decoding.pmf.ScoredItem` instances.
252
253        Returns:
254            A `Scorer` object.
255
256        Example:
257            ```python
258            from decoding.pmf import LogPMF, ScoredItem
259            from decoding.scorers import Scorer
260
261            f = lambda xs: [ScoredItem(item=x[1:], score=len(x[1:])) for x in xs]
262            scorer = Scorer.from_f_batch_str_to_batch_item(f)
263            d = LogPMF.from_samples(["_a", "_bb", "_ccc"])
264            samples = scorer(d)
265            assert samples[0].item == "ccc"
266            assert samples[0].score == 3
267            ```
268
269        """
270
271        def _f(d: LogPMF[str]) -> list[ScoredItem[str]]:
272            return list(f(d.items))
273
274        return cls(_f=_f)
275
276    @classmethod
277    def from_f_logpmf_to_batch_item(
278        cls, f: Callable[[LogPMF[str]], Sequence[ScoredItem[str]]]
279    ) -> "Scorer":
280        """
281        Construct a `Scorer` object from a function that maps a
282        `decoding.pmf.LogPMF` instance to a sequence of
283        `decoding.pmf.ScoredItem` instances. This type signature actually
284        matches much of the `decoding.estimators` module, so this constructor
285        is particularly useful for building `Scorer` instances based on
286        `decoding.estimators.MBR`, etc.
287
288        Args:
289            f: A function that maps a `decoding.pmf.LogPMF`
290                instance to a sequence of `decoding.pmf.ScoredItem` instances.
291
292        Returns:
293            A `Scorer` object.
294
295        Example:
296            ```python
297            import jax.numpy as jnp
298            from decoding.estimators import MBR
299            from decoding.pmf import LogPMF
300            from decoding.scorers import Scorer
301
302            f = lambda d: MBR(d, utility=lambda x1, x2: x1 < x2)
303            scorer = Scorer.from_f_logpmf_to_batch_item(f)
304            d = LogPMF.from_samples(["aa", "bb", "cc"])
305            samples = scorer(d)
306            assert samples[0].item == "aa"
307            assert jnp.isclose(samples[0].score, 2/3)
308            assert samples[1].item == "bb"
309            assert jnp.isclose(samples[1].score, 1/3)
310            ```
311
312        """
313
314        def _f(d: LogPMF[str]) -> list[ScoredItem[str]]:
315            return list(f(d))
316
317        return cls(_f=_f)
@dataclass(frozen=True, kw_only=True)
class Scorer:
 28@dataclass(frozen=True, kw_only=True)
 29class Scorer:
 30    """
 31    The `Scorer` class wraps and coordinates user-supplied scoring functions.
 32    """
 33
 34    _f: Callable[[LogPMF[str]], list[ScoredItem[str]]]
 35
 36    def __call__(self, d: LogPMF[str]) -> list[ScoredItem[str]]:
 37        """
 38        `__call__` is an alias for `score`.
 39        """
 40        return self.score(d)
 41
 42    def score(self, d: LogPMF[str]) -> list[ScoredItem[str]]:
 43        """
 44        Process a `decoding.pmf.LogPMF` instance and returns a list
 45        of `decoding.pmf.ScoredItem` instances that are sorted by their scores.
 46
 47        Args:
 48            d: A `decoding.pmf.LogPMF` instance.
 49
 50        Returns:
 51            A list of `decoding.pmf.ScoredItem` instances that are sorted
 52            by their scores.
 53
 54        Example:
 55            ```python
 56            from decoding.pmf import LogPMF
 57            from decoding.scorers import Scorer
 58
 59            scorer = Scorer.from_f_str_to_num(lambda x: len(x))
 60            d = LogPMF.from_samples(["a", "bb", "ccc"])
 61            samples = scorer(d)
 62            assert samples[0].item == "ccc"
 63            assert samples[0].score == 3
 64            ```
 65
 66        """
 67        return sort_scored_items(self._f(d))
 68
 69    @classmethod
 70    def from_f_str_to_num(
 71        cls, f: Callable[[str], NUM], *, parallelize: bool = False
 72    ) -> "Scorer":
 73        """
 74        Construct a `Scorer` object from a function that maps a string to a
 75        number. The `Scorer` object will then score a
 76        `decoding.pmf.LogPMF` instance by applying this function to
 77        each of its categories.
 78
 79        Args:
 80            f: A function that maps a string to a number.
 81            parallelize: A boolean indicating whether to parallelize
 82                the scoring process.
 83
 84        Returns:
 85            A `Scorer` object.
 86
 87        Example:
 88            ```python
 89            from decoding.pmf import LogPMF
 90            from decoding.scorers import Scorer
 91
 92            scorer = Scorer.from_f_str_to_num(lambda x: len(x), parallelize=True)
 93            d = LogPMF.from_samples(["a", "bb", "ccc"])
 94            samples = scorer(d)
 95            assert samples[-1].item == "a"
 96            assert samples[-1].score == 1
 97            ```
 98
 99        """
100
101        def _f(d: LogPMF[str]) -> list[ScoredItem[str]]:
102            if parallelize:
103                with ThreadPoolExecutor() as e:
104                    utilities = list(e.map(f, d.items))
105            else:
106                utilities = list(map(f, d.items))
107            return make_scored_items(d.items, utilities)
108
109        return cls(_f=_f)
110
111    @classmethod
112    def from_f_batch_str_to_batch_num(
113        cls, f: Callable[[Sequence[str]], Sequence[NUM]]
114    ) -> "Scorer":
115        """
116        Construct a `Scorer` object from a function that maps a sequence of
117        strings to a sequence of numbers. The `Scorer` object will then score
118        a `decoding.pmf.LogPMF` instance by applying this function
119        to its categories.
120
121        Args:
122            f: A function that maps a sequence of strings to a sequence of numbers.
123
124        Returns:
125            A `Scorer` object.
126
127        Example:
128            ```python
129            from decoding.pmf import LogPMF
130            from decoding.scorers import Scorer
131
132            scorer = Scorer.from_f_batch_str_to_batch_num(lambda x: [len(s) for s in x])
133            d = LogPMF.from_samples(["a", "bb", "ccc"])
134            samples = scorer(d)
135            assert samples[0].item == "ccc"
136            assert samples[0].score == 3
137            ```
138
139        """
140
141        def _f(d: LogPMF[str]) -> list[ScoredItem[str]]:
142            utilities = f(d.items)
143            return make_scored_items(d.items, utilities)
144
145        return cls(_f=_f)
146
147    @classmethod
148    def from_f_logpmf_to_batch_num(
149        cls, f: Callable[[LogPMF[str]], Sequence[NUM]]
150    ) -> "Scorer":
151        """
152        Construct a `Scorer` object from a function that maps a
153        `decoding.pmf.LogPMF` instance to a sequence of numbers.
154        The `Scorer` object will then score the
155        `decoding.pmf.LogPMF` instance directly.
156
157        Args:
158            f: A function that maps a `decoding.pmf.LogPMF`
159                instance to a sequence of numbers.
160
161        Returns:
162            A `Scorer` object.
163
164        Example:
165            ```python
166            import jax.numpy as jnp
167            from decoding.pmf import LogPMF
168            from decoding.scorers import Scorer
169
170            f = lambda d: [jnp.exp(logp) * len(item) for logp, item in d]
171            scorer = Scorer.from_f_logpmf_to_batch_num(f)
172            d = LogPMF.from_samples(["a", "bb", "bb", "ccc"])
173            samples = scorer(d)
174            assert samples[0].item == "bb"
175            assert samples[0].score == 1.0
176            assert samples[1].item == "ccc"
177            assert samples[1].score == 0.75
178            assert samples[2].item == "a"
179            assert samples[2].score == 0.25
180            ```
181
182        """
183
184        def _f(d: LogPMF[str]) -> list[ScoredItem[str]]:
185            utilities = f(d)
186            return make_scored_items(d.items, utilities)
187
188        return cls(_f=_f)
189
190    @classmethod
191    def from_f_str_to_item(
192        cls, f: Callable[[str], ScoredItem[str]], *, parallelize: bool = False
193    ) -> "Scorer":
194        """
195        Construct a `Scorer` object from a function that maps a string to a
196        `decoding.pmf.ScoredItem` instance. The `Scorer` object will then score a
197        `decoding.pmf.LogPMF` instance by applying this function to
198        each of its categories. This allows us to update not only the score
199        values but also the items themselves.
200
201        Args:
202            f: A function that maps a string to a `decoding.pmf.ScoredItem` instance.
203            parallelize: A boolean indicating whether to parallelize
204                the scoring process.
205
206        Returns:
207            A `Scorer` object.
208
209        Example:
210            ```python
211            from decoding.pmf import LogPMF, ScoredItem
212            from decoding.scorers import Scorer
213
214            def f(x):
215                if x.endswith("."):
216                    return ScoredItem(item=x[:-1], score=len(x)-1)
217                return ScoredItem(item=x, score=len(x))
218
219            scorer = Scorer.from_f_str_to_item(f, parallelize=True)
220            d = LogPMF.from_samples(["a", "bb.", "ccc"])
221            samples = scorer(d)
222            assert samples[0].item == "ccc"
223            assert samples[0].score == 3
224            assert samples[1].item == "bb"
225            assert samples[1].score == 2
226            ```
227
228        """
229
230        def _f(d: LogPMF[str]) -> list[ScoredItem[str]]:
231            if parallelize:
232                with ThreadPoolExecutor() as e:
233                    return list(e.map(f, d.items))
234            else:
235                return list(map(f, d.items))
236
237        return cls(_f=_f)
238
239    @classmethod
240    def from_f_batch_str_to_batch_item(
241        cls, f: Callable[[Sequence[str]], Sequence[ScoredItem[str]]]
242    ) -> "Scorer":
243        """
244        Construct a `Scorer` object from a function that maps a sequence of strings
245        to a sequence of `decoding.pmf.ScoredItem` instances. The `Scorer` object will
246        then score a `decoding.pmf.LogPMF` instance by applying this function
247        to its categories. This allows us to update not only the score values but
248        also the items themselves.
249
250        Args:
251            f: A function that maps a sequence of strings to a sequence of
252                `decoding.pmf.ScoredItem` instances.
253
254        Returns:
255            A `Scorer` object.
256
257        Example:
258            ```python
259            from decoding.pmf import LogPMF, ScoredItem
260            from decoding.scorers import Scorer
261
262            f = lambda xs: [ScoredItem(item=x[1:], score=len(x[1:])) for x in xs]
263            scorer = Scorer.from_f_batch_str_to_batch_item(f)
264            d = LogPMF.from_samples(["_a", "_bb", "_ccc"])
265            samples = scorer(d)
266            assert samples[0].item == "ccc"
267            assert samples[0].score == 3
268            ```
269
270        """
271
272        def _f(d: LogPMF[str]) -> list[ScoredItem[str]]:
273            return list(f(d.items))
274
275        return cls(_f=_f)
276
277    @classmethod
278    def from_f_logpmf_to_batch_item(
279        cls, f: Callable[[LogPMF[str]], Sequence[ScoredItem[str]]]
280    ) -> "Scorer":
281        """
282        Construct a `Scorer` object from a function that maps a
283        `decoding.pmf.LogPMF` instance to a sequence of
284        `decoding.pmf.ScoredItem` instances. This type signature actually
285        matches much of the `decoding.estimators` module, so this constructor
286        is particularly useful for building `Scorer` instances based on
287        `decoding.estimators.MBR`, etc.
288
289        Args:
290            f: A function that maps a `decoding.pmf.LogPMF`
291                instance to a sequence of `decoding.pmf.ScoredItem` instances.
292
293        Returns:
294            A `Scorer` object.
295
296        Example:
297            ```python
298            import jax.numpy as jnp
299            from decoding.estimators import MBR
300            from decoding.pmf import LogPMF
301            from decoding.scorers import Scorer
302
303            f = lambda d: MBR(d, utility=lambda x1, x2: x1 < x2)
304            scorer = Scorer.from_f_logpmf_to_batch_item(f)
305            d = LogPMF.from_samples(["aa", "bb", "cc"])
306            samples = scorer(d)
307            assert samples[0].item == "aa"
308            assert jnp.isclose(samples[0].score, 2/3)
309            assert samples[1].item == "bb"
310            assert jnp.isclose(samples[1].score, 1/3)
311            ```
312
313        """
314
315        def _f(d: LogPMF[str]) -> list[ScoredItem[str]]:
316            return list(f(d))
317
318        return cls(_f=_f)

The Scorer class wraps and coordinates user-supplied scoring functions.

Scorer( *, _f: Callable[[decoding.pmf.LogPMF[str]], list[decoding.pmf.ScoredItem[str]]])
def score(self, d: decoding.pmf.LogPMF[str]) -> list[decoding.pmf.ScoredItem[str]]:
42    def score(self, d: LogPMF[str]) -> list[ScoredItem[str]]:
43        """
44        Process a `decoding.pmf.LogPMF` instance and returns a list
45        of `decoding.pmf.ScoredItem` instances that are sorted by their scores.
46
47        Args:
48            d: A `decoding.pmf.LogPMF` instance.
49
50        Returns:
51            A list of `decoding.pmf.ScoredItem` instances that are sorted
52            by their scores.
53
54        Example:
55            ```python
56            from decoding.pmf import LogPMF
57            from decoding.scorers import Scorer
58
59            scorer = Scorer.from_f_str_to_num(lambda x: len(x))
60            d = LogPMF.from_samples(["a", "bb", "ccc"])
61            samples = scorer(d)
62            assert samples[0].item == "ccc"
63            assert samples[0].score == 3
64            ```
65
66        """
67        return sort_scored_items(self._f(d))

Process a decoding.pmf.LogPMF instance and returns a list of decoding.pmf.ScoredItem instances that are sorted by their scores.

Arguments:
Returns:

A list of decoding.pmf.ScoredItem instances that are sorted by their scores.

Example:
from decoding.pmf import LogPMF
from decoding.scorers import Scorer

scorer = Scorer.from_f_str_to_num(lambda x: len(x))
d = LogPMF.from_samples(["a", "bb", "ccc"])
samples = scorer(d)
assert samples[0].item == "ccc"
assert samples[0].score == 3
@classmethod
def from_f_str_to_num( cls, f: Callable[[str], float | int | jaxtyping.Float[Array, ''] | jaxtyping.Int[Array, '']], *, parallelize: bool = False) -> Scorer:
 69    @classmethod
 70    def from_f_str_to_num(
 71        cls, f: Callable[[str], NUM], *, parallelize: bool = False
 72    ) -> "Scorer":
 73        """
 74        Construct a `Scorer` object from a function that maps a string to a
 75        number. The `Scorer` object will then score a
 76        `decoding.pmf.LogPMF` instance by applying this function to
 77        each of its categories.
 78
 79        Args:
 80            f: A function that maps a string to a number.
 81            parallelize: A boolean indicating whether to parallelize
 82                the scoring process.
 83
 84        Returns:
 85            A `Scorer` object.
 86
 87        Example:
 88            ```python
 89            from decoding.pmf import LogPMF
 90            from decoding.scorers import Scorer
 91
 92            scorer = Scorer.from_f_str_to_num(lambda x: len(x), parallelize=True)
 93            d = LogPMF.from_samples(["a", "bb", "ccc"])
 94            samples = scorer(d)
 95            assert samples[-1].item == "a"
 96            assert samples[-1].score == 1
 97            ```
 98
 99        """
100
101        def _f(d: LogPMF[str]) -> list[ScoredItem[str]]:
102            if parallelize:
103                with ThreadPoolExecutor() as e:
104                    utilities = list(e.map(f, d.items))
105            else:
106                utilities = list(map(f, d.items))
107            return make_scored_items(d.items, utilities)
108
109        return cls(_f=_f)

Construct a Scorer object from a function that maps a string to a number. The Scorer object will then score a decoding.pmf.LogPMF instance by applying this function to each of its categories.

Arguments:
  • f: A function that maps a string to a number.
  • parallelize: A boolean indicating whether to parallelize the scoring process.
Returns:

A Scorer object.

Example:
from decoding.pmf import LogPMF
from decoding.scorers import Scorer

scorer = Scorer.from_f_str_to_num(lambda x: len(x), parallelize=True)
d = LogPMF.from_samples(["a", "bb", "ccc"])
samples = scorer(d)
assert samples[-1].item == "a"
assert samples[-1].score == 1
@classmethod
def from_f_batch_str_to_batch_num( cls, f: Callable[[Sequence[str]], Sequence[float | int | jaxtyping.Float[Array, ''] | jaxtyping.Int[Array, '']]]) -> Scorer:
111    @classmethod
112    def from_f_batch_str_to_batch_num(
113        cls, f: Callable[[Sequence[str]], Sequence[NUM]]
114    ) -> "Scorer":
115        """
116        Construct a `Scorer` object from a function that maps a sequence of
117        strings to a sequence of numbers. The `Scorer` object will then score
118        a `decoding.pmf.LogPMF` instance by applying this function
119        to its categories.
120
121        Args:
122            f: A function that maps a sequence of strings to a sequence of numbers.
123
124        Returns:
125            A `Scorer` object.
126
127        Example:
128            ```python
129            from decoding.pmf import LogPMF
130            from decoding.scorers import Scorer
131
132            scorer = Scorer.from_f_batch_str_to_batch_num(lambda x: [len(s) for s in x])
133            d = LogPMF.from_samples(["a", "bb", "ccc"])
134            samples = scorer(d)
135            assert samples[0].item == "ccc"
136            assert samples[0].score == 3
137            ```
138
139        """
140
141        def _f(d: LogPMF[str]) -> list[ScoredItem[str]]:
142            utilities = f(d.items)
143            return make_scored_items(d.items, utilities)
144
145        return cls(_f=_f)

Construct a Scorer object from a function that maps a sequence of strings to a sequence of numbers. The Scorer object will then score a decoding.pmf.LogPMF instance by applying this function to its categories.

Arguments:
  • f: A function that maps a sequence of strings to a sequence of numbers.
Returns:

A Scorer object.

Example:
from decoding.pmf import LogPMF
from decoding.scorers import Scorer

scorer = Scorer.from_f_batch_str_to_batch_num(lambda x: [len(s) for s in x])
d = LogPMF.from_samples(["a", "bb", "ccc"])
samples = scorer(d)
assert samples[0].item == "ccc"
assert samples[0].score == 3
@classmethod
def from_f_logpmf_to_batch_num( cls, f: Callable[[decoding.pmf.LogPMF[str]], Sequence[float | int | jaxtyping.Float[Array, ''] | jaxtyping.Int[Array, '']]]) -> Scorer:
147    @classmethod
148    def from_f_logpmf_to_batch_num(
149        cls, f: Callable[[LogPMF[str]], Sequence[NUM]]
150    ) -> "Scorer":
151        """
152        Construct a `Scorer` object from a function that maps a
153        `decoding.pmf.LogPMF` instance to a sequence of numbers.
154        The `Scorer` object will then score the
155        `decoding.pmf.LogPMF` instance directly.
156
157        Args:
158            f: A function that maps a `decoding.pmf.LogPMF`
159                instance to a sequence of numbers.
160
161        Returns:
162            A `Scorer` object.
163
164        Example:
165            ```python
166            import jax.numpy as jnp
167            from decoding.pmf import LogPMF
168            from decoding.scorers import Scorer
169
170            f = lambda d: [jnp.exp(logp) * len(item) for logp, item in d]
171            scorer = Scorer.from_f_logpmf_to_batch_num(f)
172            d = LogPMF.from_samples(["a", "bb", "bb", "ccc"])
173            samples = scorer(d)
174            assert samples[0].item == "bb"
175            assert samples[0].score == 1.0
176            assert samples[1].item == "ccc"
177            assert samples[1].score == 0.75
178            assert samples[2].item == "a"
179            assert samples[2].score == 0.25
180            ```
181
182        """
183
184        def _f(d: LogPMF[str]) -> list[ScoredItem[str]]:
185            utilities = f(d)
186            return make_scored_items(d.items, utilities)
187
188        return cls(_f=_f)

Construct a Scorer object from a function that maps a decoding.pmf.LogPMF instance to a sequence of numbers. The Scorer object will then score the decoding.pmf.LogPMF instance directly.

Arguments:
Returns:

A Scorer object.

Example:
import jax.numpy as jnp
from decoding.pmf import LogPMF
from decoding.scorers import Scorer

f = lambda d: [jnp.exp(logp) * len(item) for logp, item in d]
scorer = Scorer.from_f_logpmf_to_batch_num(f)
d = LogPMF.from_samples(["a", "bb", "bb", "ccc"])
samples = scorer(d)
assert samples[0].item == "bb"
assert samples[0].score == 1.0
assert samples[1].item == "ccc"
assert samples[1].score == 0.75
assert samples[2].item == "a"
assert samples[2].score == 0.25
@classmethod
def from_f_str_to_item( cls, f: Callable[[str], decoding.pmf.ScoredItem[str]], *, parallelize: bool = False) -> Scorer:
190    @classmethod
191    def from_f_str_to_item(
192        cls, f: Callable[[str], ScoredItem[str]], *, parallelize: bool = False
193    ) -> "Scorer":
194        """
195        Construct a `Scorer` object from a function that maps a string to a
196        `decoding.pmf.ScoredItem` instance. The `Scorer` object will then score a
197        `decoding.pmf.LogPMF` instance by applying this function to
198        each of its categories. This allows us to update not only the score
199        values but also the items themselves.
200
201        Args:
202            f: A function that maps a string to a `decoding.pmf.ScoredItem` instance.
203            parallelize: A boolean indicating whether to parallelize
204                the scoring process.
205
206        Returns:
207            A `Scorer` object.
208
209        Example:
210            ```python
211            from decoding.pmf import LogPMF, ScoredItem
212            from decoding.scorers import Scorer
213
214            def f(x):
215                if x.endswith("."):
216                    return ScoredItem(item=x[:-1], score=len(x)-1)
217                return ScoredItem(item=x, score=len(x))
218
219            scorer = Scorer.from_f_str_to_item(f, parallelize=True)
220            d = LogPMF.from_samples(["a", "bb.", "ccc"])
221            samples = scorer(d)
222            assert samples[0].item == "ccc"
223            assert samples[0].score == 3
224            assert samples[1].item == "bb"
225            assert samples[1].score == 2
226            ```
227
228        """
229
230        def _f(d: LogPMF[str]) -> list[ScoredItem[str]]:
231            if parallelize:
232                with ThreadPoolExecutor() as e:
233                    return list(e.map(f, d.items))
234            else:
235                return list(map(f, d.items))
236
237        return cls(_f=_f)

Construct a Scorer object from a function that maps a string to a decoding.pmf.ScoredItem instance. The Scorer object will then score a decoding.pmf.LogPMF instance by applying this function to each of its categories. This allows us to update not only the score values but also the items themselves.

Arguments:
  • f: A function that maps a string to a decoding.pmf.ScoredItem instance.
  • parallelize: A boolean indicating whether to parallelize the scoring process.
Returns:

A Scorer object.

Example:
from decoding.pmf import LogPMF, ScoredItem
from decoding.scorers import Scorer

def f(x):
    if x.endswith("."):
        return ScoredItem(item=x[:-1], score=len(x)-1)
    return ScoredItem(item=x, score=len(x))

scorer = Scorer.from_f_str_to_item(f, parallelize=True)
d = LogPMF.from_samples(["a", "bb.", "ccc"])
samples = scorer(d)
assert samples[0].item == "ccc"
assert samples[0].score == 3
assert samples[1].item == "bb"
assert samples[1].score == 2
@classmethod
def from_f_batch_str_to_batch_item( cls, f: Callable[[Sequence[str]], Sequence[decoding.pmf.ScoredItem[str]]]) -> Scorer:
239    @classmethod
240    def from_f_batch_str_to_batch_item(
241        cls, f: Callable[[Sequence[str]], Sequence[ScoredItem[str]]]
242    ) -> "Scorer":
243        """
244        Construct a `Scorer` object from a function that maps a sequence of strings
245        to a sequence of `decoding.pmf.ScoredItem` instances. The `Scorer` object will
246        then score a `decoding.pmf.LogPMF` instance by applying this function
247        to its categories. This allows us to update not only the score values but
248        also the items themselves.
249
250        Args:
251            f: A function that maps a sequence of strings to a sequence of
252                `decoding.pmf.ScoredItem` instances.
253
254        Returns:
255            A `Scorer` object.
256
257        Example:
258            ```python
259            from decoding.pmf import LogPMF, ScoredItem
260            from decoding.scorers import Scorer
261
262            f = lambda xs: [ScoredItem(item=x[1:], score=len(x[1:])) for x in xs]
263            scorer = Scorer.from_f_batch_str_to_batch_item(f)
264            d = LogPMF.from_samples(["_a", "_bb", "_ccc"])
265            samples = scorer(d)
266            assert samples[0].item == "ccc"
267            assert samples[0].score == 3
268            ```
269
270        """
271
272        def _f(d: LogPMF[str]) -> list[ScoredItem[str]]:
273            return list(f(d.items))
274
275        return cls(_f=_f)

Construct a Scorer object from a function that maps a sequence of strings to a sequence of decoding.pmf.ScoredItem instances. The Scorer object will then score a decoding.pmf.LogPMF instance by applying this function to its categories. This allows us to update not only the score values but also the items themselves.

Arguments:
Returns:

A Scorer object.

Example:
from decoding.pmf import LogPMF, ScoredItem
from decoding.scorers import Scorer

f = lambda xs: [ScoredItem(item=x[1:], score=len(x[1:])) for x in xs]
scorer = Scorer.from_f_batch_str_to_batch_item(f)
d = LogPMF.from_samples(["_a", "_bb", "_ccc"])
samples = scorer(d)
assert samples[0].item == "ccc"
assert samples[0].score == 3
@classmethod
def from_f_logpmf_to_batch_item( cls, f: Callable[[decoding.pmf.LogPMF[str]], Sequence[decoding.pmf.ScoredItem[str]]]) -> Scorer:
277    @classmethod
278    def from_f_logpmf_to_batch_item(
279        cls, f: Callable[[LogPMF[str]], Sequence[ScoredItem[str]]]
280    ) -> "Scorer":
281        """
282        Construct a `Scorer` object from a function that maps a
283        `decoding.pmf.LogPMF` instance to a sequence of
284        `decoding.pmf.ScoredItem` instances. This type signature actually
285        matches much of the `decoding.estimators` module, so this constructor
286        is particularly useful for building `Scorer` instances based on
287        `decoding.estimators.MBR`, etc.
288
289        Args:
290            f: A function that maps a `decoding.pmf.LogPMF`
291                instance to a sequence of `decoding.pmf.ScoredItem` instances.
292
293        Returns:
294            A `Scorer` object.
295
296        Example:
297            ```python
298            import jax.numpy as jnp
299            from decoding.estimators import MBR
300            from decoding.pmf import LogPMF
301            from decoding.scorers import Scorer
302
303            f = lambda d: MBR(d, utility=lambda x1, x2: x1 < x2)
304            scorer = Scorer.from_f_logpmf_to_batch_item(f)
305            d = LogPMF.from_samples(["aa", "bb", "cc"])
306            samples = scorer(d)
307            assert samples[0].item == "aa"
308            assert jnp.isclose(samples[0].score, 2/3)
309            assert samples[1].item == "bb"
310            assert jnp.isclose(samples[1].score, 1/3)
311            ```
312
313        """
314
315        def _f(d: LogPMF[str]) -> list[ScoredItem[str]]:
316            return list(f(d))
317
318        return cls(_f=_f)

Construct a Scorer object from a function that maps a decoding.pmf.LogPMF instance to a sequence of decoding.pmf.ScoredItem instances. This type signature actually matches much of the decoding.estimators module, so this constructor is particularly useful for building Scorer instances based on decoding.estimators.MBR, etc.

Arguments:
Returns:

A Scorer object.

Example:
import jax.numpy as jnp
from decoding.estimators import MBR
from decoding.pmf import LogPMF
from decoding.scorers import Scorer

f = lambda d: MBR(d, utility=lambda x1, x2: x1 < x2)
scorer = Scorer.from_f_logpmf_to_batch_item(f)
d = LogPMF.from_samples(["aa", "bb", "cc"])
samples = scorer(d)
assert samples[0].item == "aa"
assert jnp.isclose(samples[0].score, 2/3)
assert samples[1].item == "bb"
assert jnp.isclose(samples[1].score, 1/3)