Evaluation using test-set #12
Comments
Hi @h4y4h0o, I had the same problem when I tried to use my own train and test files. The trick for me was to first run the data transformation on test.csv and then use the generated ratings_binary.txt file as the test data. Try the following: edit your settings.conf so that test.csv sits where train.csv is at the moment. Run CARSKit; once the program has finished the data transformation, it will throw an error. You can ignore this error, because all you need is the ratings_binary.txt file created in your output folder.
dataset.ratings.wins=C:\test.csv
Next, take the ratings_binary.txt file from the output folder and place it somewhere else, for example next to your test.csv file. After doing this, you can change your settings.conf again:
dataset.ratings.wins=C:\train.csv
[...]
evaluation.setup=test-set -f C:\ratings_binary.txt
This should do the trick. Let me know if that works for you :-)
Thanks! The whole idea behind this is that your training and testing data
must have the same format. Either prepare both files yourself, or use the
internal transformer to convert the data to the correct format.
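One way to check this before running an evaluation is to compare the header lines of the two files. The following is a minimal sketch, assuming both files are comma-separated text whose first line is a header row; the class name FormatCheck and its methods are hypothetical and are not part of CARSKit.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper for illustration; not part of CARSKit.
final class FormatCheck {

    // Splits a header line on commas, trimming whitespace and lower-casing each name.
    static List<String> parseHeader(String headerLine) {
        return Arrays.stream(headerLine.split(","))
                .map(name -> name.trim().toLowerCase())
                .collect(Collectors.toList());
    }

    // True when both headers list the same column names in the same order.
    static boolean sameFormat(String trainHeader, String testHeader) {
        return parseHeader(trainHeader).equals(parseHeader(testHeader));
    }

    // Returns the first line of a file, or an empty string for an empty file.
    static String firstLine(Path file) throws IOException {
        try (BufferedReader reader = Files.newBufferedReader(file)) {
            String line = reader.readLine();
            return line == null ? "" : line;
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: java FormatCheck <train-file> <test-file>");
            return;
        }
        boolean same = sameFormat(firstLine(Paths.get(args[0])),
                                  firstLine(Paths.get(args[1])));
        System.out.println(same ? "headers match" : "headers differ");
    }
}
```

A header mismatch here only shows that the two files were prepared differently; matching headers do not guarantee that the tool will accept the files.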
On Tue, Oct 31, 2017 at 9:42 AM, MatthiasKirsch wrote:
Hi @h4y4h0o <https://github.com/h4y4h0o>,
I had the same problem when I tried to use my own train and test files.
The trick for me was to first run the data transformation on test.csv
and then use the generated ratings_binary.txt file as the test data.
Try the following: edit your settings.conf so that test.csv sits where
train.csv is at the moment. Run CARSKit; once the program has finished
the data transformation, it will throw an error.
You can ignore this error, because all you need is the
ratings_binary.txt file created in your output folder.
dataset.ratings.wins=C:\test.csv
Next you extract the ratings_binary.txt file from the output folder and
place it somewhere else, for example next to your test.csv file. After
doing this you can change your settings.conf again:
dataset.ratings.wins=C:\train.csv
[...]
evaluation.setup=test-set -f C:\ratings_binary.txt
So this should do the trick. Let me know if that works for you :-)
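Thank you for your responses.
I have another question about the evaluation: I ran an algorithm multiple times on the same training and testing set, but the evaluation results (MAE, RMSE, etc.) are different each time. I don't understand what is happening.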
Some algorithms are sensitive to initializations -- that's the reason why
On Tue, Nov 7, 2017 at 10:46 PM, h4y4h0o wrote:
I have another question about the evaluation:
I ran an algorithm multiple times on the same training and testing set,
but the evaluation results (MAE, RMSE, etc.) are different each time!
I don't understand what is happening.
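Sensitivity to initialization can be illustrated with a generic sketch: when starting values are drawn from an unseeded random generator, each run starts from a different point, while a fixed seed makes the starting point repeatable. The class SeedDemo and its method are hypothetical; the thread does not say how CARSKit initializes its models or whether it exposes a seed, so this shows only the general effect.

```java
import java.util.Arrays;
import java.util.Random;

// Hypothetical illustration; not CARSKit code.
final class SeedDemo {

    // Draws `size` small Gaussian starting values from the given generator.
    static double[] initFactors(Random rng, int size) {
        double[] factors = new double[size];
        for (int i = 0; i < size; i++) {
            factors[i] = 0.01 * rng.nextGaussian();
        }
        return factors;
    }

    public static void main(String[] args) {
        // Two generators with the same seed produce the same starting values.
        double[] first = initFactors(new Random(42L), 5);
        double[] second = initFactors(new Random(42L), 5);
        // An unseeded generator starts from a different state on each run.
        double[] unseeded = initFactors(new Random(), 5);

        System.out.println("same seed, same values: " + Arrays.equals(first, second));
        System.out.println("unseeded, same values:  " + Arrays.equals(first, unseeded));
    }
}
```

A model trained from different starting values can converge to slightly different parameters, which is one way MAE and RMSE can differ between runs on identical data.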
OK. This issue was reported by several users. And I had this issue recently too.
Hi,
I recently started using CARSKit to compare some state-of-the-art context-aware recommendation algorithms. I would like to evaluate them by manually supplying the training and testing sets.
It creates the binary file, but then I get the error "value already present: 0".
I checked, and there are no duplicate lines in the train and test files.
What could be the problem?
Here is my config file:
dataset.ratings.wins=C:\train.csv
dataset.social.wins=-1
dataset.social.lins=-1
ratings.setup=-threshold 3 -datatransformation -1
recommender=camf_ci
evaluation.setup=test-set -f C:\testFile_0.csv
item.ranking=off -topN 10
output.setup=-folder CARSKit.Workspace -verbose on, off --to-file results.txt
guava.cache.spec=maximumSize=200,expireAfterAccess=2m
########## Model-based Methods ##########
num.factors=10
num.max.iter=100
learn.rate=2e-2 -max -1 -bold-driver
reg.lambda=0.0001 -c 0.001
pgm.setup=-alpha 2 -beta 0.5 -burn-in 300 -sample-lag 10 -interval 100
similarity=pcc
num.shrinkage=-1
num.neighbors=10
The error output:
java.lang.IllegalArgumentException: value already present: 0
at com.google.common.collect.HashBiMap.put(HashBiMap.java:238)
at com.google.common.collect.HashBiMap.put(HashBiMap.java:215)
at carskit.data.processor.DataDAO.readData(DataDAO.java:208)
at carskit.main.CARSKit.runAlgorithm(CARSKit.java:319)
at carskit.main.CARSKit.execute(CARSKit.java:121)
at carskit.main.CARSKit.main(CARSKit.java:93)
Thanks in advance for your help.
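The top frames of the trace point at Guava's HashBiMap.put, which throws IllegalArgumentException with the message "value already present" when the given value is already bound to a different key. The sketch below reproduces that rule with a hypothetical stand-in class (TinyBiMap, built on java.util.HashMap so that it runs without Guava). It illustrates the constraint only and makes no claim about what DataDAO.readData stores in the map.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in that mimics the one-to-one rule of a bidirectional map.
final class TinyBiMap<K, V> {
    private final Map<K, V> forward = new HashMap<>();
    private final Map<V, K> backward = new HashMap<>();

    // Binds key to value; rejects a value that already belongs to another key.
    V put(K key, V value) {
        if (backward.containsKey(value) && !backward.get(value).equals(key)) {
            throw new IllegalArgumentException("value already present: " + value);
        }
        V previous = forward.put(key, value);
        if (previous != null) {
            backward.remove(previous); // the key's old value is free again
        }
        backward.put(value, key);
        return previous;
    }

    public static void main(String[] args) {
        TinyBiMap<String, Integer> ids = new TinyBiMap<>();
        ids.put("a", 0);
        try {
            ids.put("b", 0); // a second key claims the value 0
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage()); // prints "value already present: 0"
        }
    }
}
```

In other words, the message means that two different keys tried to claim the value 0, which is a different condition from having duplicate lines in the data files.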