Evaluation for item recommendation #5
Comments
Hello, thanks for your interest!

1). You can use binary values: for example, 1 when the user has interacted with the item and 0 when there is no interaction. These values are treated as ratings, or as probabilities that a user interacts with an item. Your data format is correct, but you should also include negative feedback in your data.

2). Well, it depends on the setting. For example, if a user did not interact with an item in a specific context, e.g., home, he or she may still interact with it in another context, such as cinema. In our evaluations, however, we assume that a user will NOT interact again with an item that he or she has already interacted with in the same context. You can change evalRankings() yourself to suit your own goals.

3). My answers in parts 1 and 2 should help you understand the outputs. In your data, the only rating is 1.0. As for the error below, you have duplicated interaction records, which relates to the answer in part 2.

2016-04-12 17:42:08,580 -- value already present: 0
java.lang.IllegalArgumentException: value already present: 0
    at com.google.common.collect.HashBiMap.put(HashBiMap.java:238)
    at com.google.common.collect.HashBiMap.put(HashBiMap.java:215)
    at carskit.data.processor.DataDAO.readData(DataDAO.java:169)
    at carskit.main.CARSKit.runAlgorithm(CARSKit.java:317)
    at carskit.main.CARSKit.execute(CARSKit.java:115)
    at carskit.main.CARSKit.main(CARSKit.java:87)
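Since the error above is attributed to duplicated interaction records, it can help to scan the input before handing it to CARSKit. The following is a minimal sketch, not part of CARSKit: it assumes a comma-separated compact file with a header row and the columns user, item, rating, followed by one or more context columns, as in the sample from this issue. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of CARSKit.
final class DuplicateInteractionCheck {

    /** Returns every data row whose user, item and context repeat an earlier row. */
    static List<String> findDuplicates(List<String> lines) {
        Set<String> seen = new HashSet<>();
        List<String> duplicates = new ArrayList<>();
        for (int i = 1; i < lines.size(); i++) {      // index 0 is assumed to be the header
            String line = lines.get(i).trim();
            if (line.isEmpty()) {
                continue;
            }
            String[] f = line.split("\\s*,\\s*");
            if (f.length < 4) {
                continue;                             // not a full user,item,rating,context record
            }
            // Key on user, item and all context columns; the rating (index 2) is ignored.
            String context = String.join("|", Arrays.copyOfRange(f, 3, f.length));
            String key = f[0] + "|" + f[1] + "|" + context;
            if (!seen.add(key)) {
                duplicates.add(line);
            }
        }
        return duplicates;
    }

    public static void main(String[] args) {
        List<String> sample = List.of(
                "user, item, rating, location",
                "1, 2, 1, home",
                "2, 1, 1, home",
                "1, 2, 1, home",   // repeats the first record
                "1, 3, 1, work");
        System.out.println(findDuplicates(sample));
    }
}
```

Rows reported by this check can be removed or merged before running the toolkit. Whether this resolves the exception for a given file depends on how the data were produced.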
Thanks for your reply! I changed the input file and managed to make it work. There seems to be an issue in the automatic conversion from compact input to binary input. For example, the compact version of the attached input files does not work (I get an error while reading the test file), while the (equivalent) binary input version works fine. Following your advice, I modified evalRankings() by removing the part that excludes items that have already been ranked. It seems to work fine so far; I am running some tests.
Thanks for the information. I will double-check the data conversion.
Hello again. After some tinkering, I am a bit unsure about the best strategy to adopt for generating top-k recommendations from positive-only binary input. Checking the literature, it appears to me that all of the context-aware algorithms are designed for the case where the input contains ratings. The case of positive-only binary input is trickier: I can't just set the rating to 0 for the user-item-context combinations for which I didn't observe an interaction, since that would penalise those items by assuming that a lack of interaction is a negative rating. In addition, an input file containing all the possible non-observed combinations would be enormous. I already have reasonable non-context-aware baselines (SLIM and BPR) that work correctly in this case. In your opinion, what can I do to get reasonable context-aware top-k items?
Hello, the design in CARSKit actually follows the original design in LibRec. There is a setting for a rating threshold so that you can bin the profiles into relevant and irrelevant ones. In this case you can view the values as probabilities, for example the probability that the user will like an item in a specific context. In recommender systems, we would like to recommend the items whose predicted rating is larger than a threshold; that is, we do not want to recommend a book to a user who would probably give it only two stars. I agree with you that the design is not that good. In my view, the best way is to use the rating threshold only in the evaluation process. For example, we do not bin the profiles at the beginning and simply apply the rating threshold when we recommend items to the users: items with a predicted rating below the threshold will not be pushed to the user. If you like this idea, you can download a copy of the source code and change it accordingly. I will run some experiments and evaluations to examine which way is better. If the latter is better, I will update the library accordingly. Thanks for your comments and sorry for my late response.
Hello, I have changed the evaluation for topN recommendation. If a rating threshold is set for topN recommendation, the threshold is no longer used at the start to bin the ratings into 0s and 1s. Instead, the threshold is applied only in the evalRanking() process, where only the items with a predicted rating larger than the threshold will be recommended. Thanks for your advice and suggestions!
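The idea of applying the threshold only at recommendation time can be sketched as follows. This is an illustration of the approach described in this thread, not the actual evalRanking() code in CARSKit; the class name, method name, and the map-based input are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical illustration, not CARSKit code.
final class ThresholdedTopN {

    /**
     * Keeps only items whose predicted rating is strictly above the threshold,
     * orders them by predicted rating (highest first) and returns at most n item ids.
     */
    static List<Integer> recommend(Map<Integer, Double> predicted, double threshold, int n) {
        List<Map.Entry<Integer, Double>> candidates = new ArrayList<>();
        for (Map.Entry<Integer, Double> e : predicted.entrySet()) {
            if (e.getValue() > threshold) {
                candidates.add(e);
            }
        }
        candidates.sort(Map.Entry.<Integer, Double>comparingByValue().reversed());

        List<Integer> top = new ArrayList<>();
        for (int i = 0; i < Math.min(n, candidates.size()); i++) {
            top.add(candidates.get(i).getKey());
        }
        return top;
    }

    public static void main(String[] args) {
        // Item id -> predicted rating for one user in one context (made-up values).
        Map<Integer, Double> predicted = Map.of(1, 0.9, 2, 0.4, 3, 0.7, 4, 0.95);
        System.out.println(recommend(predicted, 0.5, 2));
    }
}
```

With this arrangement the training data keep their original ratings, and the threshold only decides which predictions are allowed into the ranked list.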
Just to clarify: I also have positive-only input. Do I need to explicitly create data records with a 0 rating, or will the model do that for me automatically?
Hello, I think so, since there must be negative feedback to make it work.
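If explicit 0-rated records are needed but listing every unobserved combination is impractical, one common workaround is to sample a few unobserved items per observed interaction. Negative sampling is not something this thread or CARSKit prescribes; the sketch below is an assumption-laden illustration (hypothetical class and method names, compact rows of the form user, item, rating, context without a header). Sampled zeros still treat an unobserved interaction as negative feedback, so the concern raised earlier in the thread applies here too.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Random;
import java.util.Set;

// Hypothetical helper, not part of CARSKit.
final class NegativeSampler {

    /**
     * For each positive row "user, item, rating, context", tries to draw up to
     * perPositive items that the same user has not been seen with in the same
     * context, and emits them as rows with rating 0.
     */
    static List<String> sampleNegatives(List<String> positives, int perPositive, long seed) {
        Set<String> observed = new HashSet<>();
        Set<String> itemSet = new LinkedHashSet<>();
        for (String row : positives) {
            String[] f = row.trim().split("\\s*,\\s*");
            observed.add(f[0] + "," + f[1] + "," + f[3]);
            itemSet.add(f[1]);
        }
        List<String> items = new ArrayList<>(itemSet);

        Random rng = new Random(seed);
        List<String> negatives = new ArrayList<>();
        for (String row : positives) {
            String[] f = row.trim().split("\\s*,\\s*");
            int added = 0;
            // Bounded number of attempts so the loop ends even if no unobserved item is left.
            for (int attempt = 0; attempt < 10 * perPositive && added < perPositive; attempt++) {
                String item = items.get(rng.nextInt(items.size()));
                String key = f[0] + "," + item + "," + f[3];
                // add() returns false for observed or already sampled combinations,
                // which also keeps the output free of duplicate records.
                if (observed.add(key)) {
                    negatives.add(f[0] + ", " + item + ", 0, " + f[3]);
                    added++;
                }
            }
        }
        return negatives;
    }

    public static void main(String[] args) {
        List<String> positives = List.of("1, 2, 1, home", "2, 1, 1, home", "1, 3, 1, work");
        sampleNegatives(positives, 1, 42L).forEach(System.out::println);
    }
}
```

The sampled rows can be appended to the positive rows before training. The number of negatives per positive is a tuning choice and should be validated against the baselines.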
Has anyone on this list used CARSKit in a distributed environment? If so, could you give me a pointer on how to do this, please? I'm wondering whether CARSKit can be used with Spark.
Hello and thanks for this very interesting piece of software.
I am trying to use it for item recommendation based on positive-only input (interactions between a user and an item).
I have some questions related to the format of the input and the algorithm:
I set all the ratings as 1, obtaining a compact format input like:
user, item, rating, location
1, 2, 1, home
2, 1, 1, home
1, 3, 1, work
...
Does the algorithm take care explicitly of the user-item negative interactions (i.e. when a user did not interact with an item)?
Is it possible to obtain this behaviour?
Just to add all the information, my config file is: