swetharam/implementation-of-Ngrams
Homework 1: Compute the bigram probability of two sentences given on the command line

To run this code, give the following command on the command line:

    python homework1.py Corpus.txt "Sentence 1" "Sentence 2"

NOTE: Put homework1.py and Corpus.txt in the same folder, or give the relative path for both. Each sentence must be enclosed in double quotes, and the two sentences must be separated by a space.

Add-one smoothing is used as the smoothing technique.

SAMPLE OUTPUT:

For sentence 1: Before Smoothing

The following values are for bigrams of the first sentence:

    [[ 0  6  7  1  0  0  0  1  0]
     [ 0  0  0  0  0  0  0  0  0]
     [ 0  0  0  5  0  0  0  5  0]
     [23  0  0  0 28  1  0  0 43]
     [ 2  0  0  0  0  1  1  0  0]
     [ 0  0  0  0  0  0  0  0  0]
     [17  0  0 85  0  0  0 85  2]
     [23  0  0  0 28  1  0  0 43]
     [ 0  0  0  0  0  0  0  0  0]]

The following values are the bigram probabilities of the first sentence:

    [[ 0. 0.01327434 0.01548673 0.00221239 0. 0. 0. 0.00221239 0. ]
     [ 0. 0. 0. 0. 0. 0. 0. 0. 0. ]
     [ 0. 0. 0. 0.09259259 0. 0. 0. 0.09259259 0. ]
     [ 0.03050398 0. 0. 0. 0.03713528 0.00132626 0. 0. 0.05702918]
     [ 0.03921569 0. 0. 0. 0. 0.01960784 0.01960784 0. 0. ]
     [ 0. 0. 0. 0. 0. 0. 0. 0. 0. ]
     [ 0.04545455 0. 0. 0.22727273 0. 0. 0. 0.22727273 0.00534759]
     [ 0.03050398 0. 0. 0. 0.03713528 0.00132626 0. 0. 0.05702918]
     [ 0. 0. 0. 0. 0. 0. 0. 0. 0. ]]

Total Probability before smoothing is 0.446911955519

For sentence 1: After Smoothing

The following values are for bigrams of the first sentence after performing smoothing on them:

    [[ 1  7  8  2  1  1  1  2  1]
     [ 1  1  1  1  1  1  1  1  1]
     [ 1  1  1  6  1  1  1  6  1]
     [24  1  1  1 29  2  1  1 44]
     [ 3  1  1  1  1  2  2  1  1]
     [ 1  1  1  1  1  1  1  1  1]
     [18  1  1 86  1  1  1 86  3]
     [24  1  1  1 29  2  1  1 44]
     [ 1  1  1  1  1  1  1  1  1]]

The probabilities after performing smoothing on sentence 1:

    [[ 0.00021725 0.00152075 0.001738 0.0004345 0.00021725 0.00021725 0.00021725 0.0004345 0.00021725]
     [ 0.00024044 0.00024044 0.00024044 0.00024044 0.00024044 0.00024044 0.00024044 0.00024044 0.00024044]
     [ 0.00023781 0.00023781 0.00023781 0.00142687 0.00023781 0.00023781 0.00023781 0.00142687 0.00023781]
     [ 0.00489297 0.00020387 0.00020387 0.00020387 0.00591233 0.00040775 0.00020387 0.00020387 0.00897044]
     [ 0.00071395 0.00023798 0.00023798 0.00023798 0.00023798 0.00047596 0.00047596 0.00023798 0.00023798]
     [ 0.00023912 0.00023912 0.00023912 0.00023912 0.00023912 0.00023912 0.00023912 0.00023912 0.00023912]
     [ 0.0039779 0.00022099 0.00022099 0.01900552 0.00022099 0.00022099 0.00022099 0.01900552 0.00066298]
     [ 0.00489297 0.00020387 0.00020387 0.00020387 0.00591233 0.00040775 0.00020387 0.00020387 0.00897044]
     [ 0.00023646 0.00023646 0.00023646 0.00023646 0.00023646 0.00023646 0.00023646 0.00023646 0.00023646]]

Total Probability after smoothing is 0.0377914439311

For sentence 2: Before Smoothing

The following values are for bigrams of the second sentence before smoothing:

    [[ 0  6  7  1  0  0  0  1  0]
     [ 0  0  0  0  0  0  0  0  0]
     [ 0  0  0  5  0  0  0  5  0]
     [23  0  0  0 28  1  0  0 43]
     [ 2  0  0  0  0  1  1  0  0]
     [ 0  0  0  0  0  0  0  0  0]
     [17  0  0 85  0  0  0 85  2]
     [23  0  0  0 28  1  0  0 43]
     [ 0  0  0  0  0  0  0  0  0]]

The following values are the bigram probabilities of the second sentence:

    [[ 0. 0.01327434 0.01548673 0.00221239 0. 0. 0. 0.00221239 0. ]
     [ 0. 0. 0. 0. 0. 0. 0. 0. 0. ]
     [ 0. 0. 0. 0.0066313 0. 0. 0. 0.0066313 0. ]
     [ 0.47916667 0. 0. 0. 0.58333333 0.02083333 0. 0. 0.89583333]
     [ 0.2 0. 0. 0. 0. 0.1 0.1 0. 0. ]
     [ 0. 0. 0. 0. 0. 0. 0. 0. 0. ]
     [ 0.27419355 0. 0. 1.37096774 0. 0. 0. 1.37096774 0.03225806]
     [ 0.0777027 0. 0. 0. 0.09459459 0.00337838 0. 0. 0.14527027]
     [ 0. 0. 0. 0. 0. 0. 0. 0. 0. ]]

Total Probability before smoothing is 2.21947698156

For sentence 2: After Smoothing

The following values are for bigrams of the second sentence after performing smoothing on them:

    [[ 1  7  8  2  1  1  1  2  1]
     [ 1  1  1  1  1  1  1  1  1]
     [ 1  1  1  6  1  1  1  6  1]
     [24  1  1  1 29  2  1  1 44]
     [ 3  1  1  1  1  2  2  1  1]
     [ 1  1  1  1  1  1  1  1  1]
     [18  1  1 86  1  1  1 86  3]
     [24  1  1  1 29  2  1  1 44]
     [ 1  1  1  1  1  1  1  1  1]]

The probabilities after performing smoothing on sentence 2:

    [[ 0.00021725 0.00152075 0.001738 0.0004345 0.00021725 0.00021725 0.00021725 0.0004345 0.00021725]
     [ 0.00023866 0.00023866 0.00023866 0.00023866 0.00023866 0.00023866 0.00023866 0.00023866 0.00023866]
     [ 0.00020387 0.00020387 0.00020387 0.00122324 0.00020387 0.00020387 0.00020387 0.00122324 0.00020387]
     [ 0.00571565 0.00023815 0.00023815 0.00023815 0.00690641 0.0004763 0.00023815 0.00023815 0.01047869]
     [ 0.00072098 0.00024033 0.00024033 0.00024033 0.00024033 0.00048065 0.00048065 0.00024033 0.00024033]
     [ 0.00022099 0.00022099 0.00022099 0.00022099 0.00022099 0.00022099 0.00022099 0.00022099 0.00022099]
     [ 0.00427249 0.00023736 0.00023736 0.02041301 0.00023736 0.00023736 0.00023736 0.02041301 0.00071208]
     [ 0.0053969 0.00022487 0.00022487 0.00022487 0.00652125 0.00044974 0.00022487 0.00022487 0.00989431]
     [ 0.00024027 0.00024027 0.00024027 0.00024027 0.00024027 0.00024027 0.00024027 0.00024027 0.00024027]]

Total Probability after smoothing is 0.0408980249942
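
For readers who want to see the idea behind the tables above, here is a minimal sketch of bigram counting with and without add-one smoothing. It is not the code in homework1.py, which is not shown here. The function names (tokenize, bigram_counts, count_table, bigram_prob, sentence_prob), the lowercase whitespace tokenizer, the row/column orientation of the count table, and the use of a chain-rule product for the sentence score are all assumptions made for illustration.

```python
from collections import Counter


def tokenize(text):
    # Assumption: lowercase and split on whitespace; the real script may differ.
    return text.lower().split()


def bigram_counts(tokens):
    # Count single words and adjacent word pairs in the corpus.
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    return unigrams, bigrams


def count_table(sentence, bigrams, smoothing=False):
    # Table of corpus bigram counts for every (row word, column word) pair
    # taken from the sentence. Assumption: rows are the previous word and
    # columns are the next word. Add-one smoothing adds 1 to every cell.
    words = tokenize(sentence)
    add = 1 if smoothing else 0
    return [[bigrams[(row, col)] + add for col in words] for row in words]


def bigram_prob(prev, word, unigrams, bigrams, vocab_size, smoothing=False):
    # Unsmoothed: C(prev, word) / C(prev)
    # Add-one:    (C(prev, word) + 1) / (C(prev) + V)
    if smoothing:
        return (bigrams[(prev, word)] + 1) / (unigrams[prev] + vocab_size)
    if unigrams[prev] == 0:
        return 0.0
    return bigrams[(prev, word)] / unigrams[prev]


def sentence_prob(sentence, unigrams, bigrams, vocab_size, smoothing=False):
    # Assumption: the sentence score is the product of its bigram probabilities.
    tokens = tokenize(sentence)
    prob = 1.0
    for prev, word in zip(tokens, tokens[1:]):
        prob *= bigram_prob(prev, word, unigrams, bigrams, vocab_size, smoothing)
    return prob


if __name__ == "__main__":
    # Toy corpus standing in for Corpus.txt (hypothetical data).
    corpus_tokens = tokenize("the cat sat on the mat the cat ran")
    unigrams, bigrams = bigram_counts(corpus_tokens)
    vocab_size = len(unigrams)
    for smoothing in (False, True):
        print("smoothing:", smoothing)
        print(count_table("the cat sat", bigrams, smoothing))
        print(sentence_prob("the cat sat", unigrams, bigrams, vocab_size, smoothing))
```

With add-one smoothing, a bigram that never occurs in the corpus still gets a small non-zero probability, so a sentence containing one unseen word pair no longer scores exactly zero.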