-
Hello lambeq-dev team. However, once the model is loaded from a checkpoint with those weights, what does it do exactly when I feed it unseen sentences containing possibly some unseen words?
-
Hi, it is correct that, in the example notebooks, we extract the parameters from all the diagrams of the dataset. Due to the small vocabulary size of the example tasks, it is highly unlikely that the validation or test set contains tokens that do not appear in the training set. However, if this very unlikely event occurs, we make sure that there is a parameter stored for each token, even if it is only initialised at random. Otherwise, the model would raise an exception during evaluation (along the lines of "token not found").

In reality, we don't have such a well-defined dataset or corpus, and it is very likely that the model has to deal with unknown words. This can be handled by replacing rare words in the dataset/corpus with a dummy `<unk>` token. During testing/inference we then replace any unseen word with the same `<unk>` token, so that the model does not raise an exception. Be aware that for a syntax-based model like DisCoCat, the `<unk>` token will be word-type dependent, which makes the whole thing a little trickier. I hope this helps.
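To illustrate the idea, here is a minimal self-contained sketch of that `<unk>` replacement strategy. The helper names (`build_vocab`, `replace_unknowns`, `min_count`) are invented for this example and are not part of lambeq:

```python
from collections import Counter

UNK = "<unk>"

def build_vocab(train_sentences, min_count=2):
    """Keep only tokens that appear at least `min_count` times;
    everything rarer will be mapped to the <unk> token."""
    counts = Counter(tok for sent in train_sentences for tok in sent.split())
    return {tok for tok, c in counts.items() if c >= min_count}

def replace_unknowns(sentence, vocab):
    """Replace every token outside the training vocabulary with <unk>."""
    return " ".join(tok if tok in vocab else UNK for tok in sentence.split())

train = ["Alice loves Bob", "Bob loves Alice", "Alice likes Bob"]
vocab = build_vocab(train, min_count=2)   # {"Alice", "loves", "Bob"}

# At test time, unseen words ("Charlie") fall back to the shared <unk> entry,
# so the model can look up a parameter instead of raising an exception.
print(replace_unknowns("Charlie loves Alice", vocab))  # -> "<unk> loves Alice"
```

For a syntax-based model you would keep one such fallback per word type (e.g. hypothetical `<unk>_n` and `<unk>_v` tokens for nouns and verbs) instead of a single shared token.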
-
Hi, no problem. Imagine your dataset consists of only one diagram, which represents the sentence "Alice loves Bob". You apply a circuit ansatz, and your circuit might look like this:

[image of the parametrised circuit; the image is a bit cropped, sorry]

You can see that the circuit contains 8 symbolic parameters. First of all, you initialise the model:

```python
model = TketModel.from_diagrams([alice_loves_bob])
```

This extracts the unique symbols from the diagrams (see lambeq/lambeq/training/model.py, lines 146 to 148 at commit d911686). After that, we can call the method `TketModel.initialise_weights()`. This method initialises a random value per symbol, stored in a list under the attribute `model.weights` (see lambeq/lambeq/training/quantum_model.py, lines 90 to 94 at commit d911686).

To evaluate the circuit, we now need to "lambdify" it. "Lambdifying" is a term that comes from symbolic Python (sympy): it means converting a symbolic expression into a lambda function that takes the concrete values as arguments. Consider this example:

```python
>>> from sympy.abc import x
>>> from sympy.utilities.lambdify import lambdify, implemented_function
>>> f = implemented_function('f', lambda x: x+1)
>>> lam_f = lambdify(x, f(x))
>>> lam_f(4)
5
```

The same thing happens with the circuit inside the model (see lambeq/lambeq/training/tket_model.py, lines 69 to 72 at commit d911686): we create a lambda function by passing all possible symbols to it (see lambeq/lambeq/training/tket_model.py, lines 105 to 108 at commit d911686). Defining a loss and estimating some gradient now lets us tune the values stored in `model.weights`.
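To make the whole flow concrete, here is a small self-contained sketch that mimics those steps with plain sympy/numpy. It is not the actual lambeq internals; the symbol names and the toy expression standing in for the circuit are invented for this example:

```python
import numpy as np
from sympy import symbols, lambdify, cos, sin

# Stand-ins for the free symbols a circuit ansatz would produce for
# "Alice loves Bob" (the real names come from the words and their types).
alice_0, loves_0, bob_0 = symbols("Alice_0 loves_0 Bob_0")
free_symbols = [alice_0, loves_0, bob_0]

# A toy "circuit output" as a symbolic expression; in lambeq this role is
# played by the parametrised circuit itself.
expr = cos(alice_0) * sin(loves_0) + bob_0

# initialise_weights: one random value per symbol, kept in a flat array.
rng = np.random.default_rng(0)
weights = rng.random(len(free_symbols))

# Lambdify once, then evaluate by binding the concrete weights to the symbols.
evaluate = lambdify(free_symbols, expr)
print(evaluate(*weights))

# Training then amounts to nudging `weights` to lower a loss and calling
# `evaluate` again; the symbolic expression itself never changes.
```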
-
By default, yes, this is true. However, you can always adapt a model to your needs by creating a custom model that inherits from `QuantumModel`. Keep in mind that the problem of how to deal with unknown words is present in all of NLP and is not lambeq-specific. There are many different approaches to tackle it, and we don't want to limit the user's choices by implementing a default way of doing it. We designed the models with research applications in mind, where well-defined validation and test sets are normally available.

However, you are right that we might want to change some things about the current interface. At the moment, we are passing […]. For now, though, we leave it to the user to adapt the predefined models to their needs by creating subclasses.
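For illustration only, here is a minimal sketch of what such a fallback-aware model could look like. `UnkAwareModel` and `weight_of` are invented names, and this is not the real lambeq `QuantumModel` interface; it just shows the idea of keeping one extra parameter that unknown symbols back off to:

```python
import numpy as np
from sympy import Symbol

class UnkAwareModel:
    """Sketch of a model that stores one weight per training symbol
    plus a single fallback weight for unknown symbols."""

    UNK = Symbol("<unk>")

    def __init__(self, train_symbols, rng=None):
        rng = rng or np.random.default_rng()
        self.symbols = list(train_symbols) + [self.UNK]
        self.weights = rng.random(len(self.symbols))
        self._index = {s: i for i, s in enumerate(self.symbols)}

    def weight_of(self, symbol):
        # Back off to the <unk> parameter instead of raising a KeyError,
        # mirroring the <unk> replacement strategy described above.
        return self.weights[self._index.get(symbol, self._index[self.UNK])]

train_symbols = [Symbol("Alice_0"), Symbol("loves_0"), Symbol("Bob_0")]
model = UnkAwareModel(train_symbols)

print(model.weight_of(Symbol("Alice_0")))    # learned parameter
print(model.weight_of(Symbol("Charlie_0")))  # falls back to the <unk> weight
```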
-
Hello, I'm not sure I entirely understood how the weights of `QuantumModel` work. Analyzing the source code, I figured that the model's weights are the unique symbols of the parametrized quantum circuit, and that they're initialized by calling the `from_diagrams` method. However, in the example shown in the documentation, you pass to that method both the training and the validation circuits, so that you eventually have those parameters as well in your model. Is this correct? If it is the case, I don't understand how the model can be used on unseen data, and which weights are loaded if the model is loaded from a checkpoint.