Run the ipynb notebooks 01-10 in sequence, or run train.sh.
01 Get text (title, abstract, keyword, venue) embeddings from tfidf, word2vec, chatglm3 and bge-m3
02 For each author ID, calculate the similarity between each pid and the other pids
03 Extract strongly correlated information (co-author, co-org, co-keyword...)
04 Tree model
05 GNN model and post-processing
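Step 02 can be sketched roughly as below. This is only an illustration, not the notebook code: the choice of cosine similarity, the mean over the other pids as the aggregate, and the names `cosine` and `mean_similarity_to_others` are all assumptions made for the example. The embeddings are toy 2-d vectors standing in for the step 01 output.

```python
import math
from typing import Dict, List


def cosine(u: List[float], v: List[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def mean_similarity_to_others(embeddings: Dict[str, List[float]]) -> Dict[str, float]:
    """For one author, score each pid by its mean cosine similarity to the other pids.

    `embeddings` maps pid -> embedding vector (hypothetical layout, assumed here).
    """
    pids = list(embeddings)
    scores = {}
    for pid in pids:
        others = [cosine(embeddings[pid], embeddings[q]) for q in pids if q != pid]
        # An author with a single pid has nothing to compare against.
        scores[pid] = sum(others) / len(others) if others else 0.0
    return scores


# Toy data: p1 and p2 point the same way, p3 is orthogonal to both.
author_papers = {"p1": [1.0, 0.0], "p2": [1.0, 0.0], "p3": [0.0, 1.0]}
scores = mean_similarity_to_others(author_papers)
```

A pid with a low score relative to the author's other pids is a candidate outlier, and such scores could be passed on as features to the later tree and GNN steps.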