Language: Go 1.18
# Compile the code
go build
# accepts as arguments a list of one or more file paths
# Linux/Mac
./counter pg2009.txt pg2010.txt pg2011.txt
# Windows
.\counter.exe .\pg2009.txt .\pg2010.txt .\pg2011.txt
# input on stdin
# Linux/Mac
cat pg2009.txt | ./counter
# Windows (PowerShell, where cat is an alias for Get-Content)
cat .\pg2009.txt | .\counter.exe
# Or run without building first: accepts as arguments a list of one or more file paths
# Linux/Mac
go run main.go pg2009.txt pg2010.txt pg2011.txt
# Windows
go run .\main.go .\pg2009.txt .\pg2010.txt .\pg2011.txt
# input on stdin
# Linux/Mac
cat pg2009.txt | go run main.go
# Windows (PowerShell, where cat is an alias for Get-Content)
cat .\pg2009.txt | go run .\main.go
The idea here is simple:
- Check whether the input comes from file paths on the command line or from stdin, then process it accordingly;
- Extract every word from the input with a regular expression, then convert it to lower case;
- Count the frequency of every three-word sequence;
- Process each file given on the command line in its own goroutine. To prevent conflicting writes, I take a Mutex lock whenever the frequency counts are updated.
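The steps above can be sketched roughly as follows. This is a minimal illustration, not the actual main.go: the names `wordRe`, `countTrigrams`, `counter`, `merge`, and `addFrom` are hypothetical, the word pattern (letters, digits, and apostrophes) is an assumption, and sequences are counted within each file rather than across file boundaries.

```go
package main

import (
	"fmt"
	"io"
	"os"
	"regexp"
	"strings"
	"sync"
)

// wordRe is an assumed definition of a "word": a run of letters, digits, or apostrophes.
var wordRe = regexp.MustCompile(`[\p{L}\p{N}']+`)

// countTrigrams lower-cases text, extracts its words with wordRe, and
// returns the frequency of every three-word sequence.
func countTrigrams(text string) map[string]int {
	words := wordRe.FindAllString(strings.ToLower(text), -1)
	counts := make(map[string]int)
	for i := 0; i+2 < len(words); i++ {
		counts[strings.Join(words[i:i+3], " ")]++
	}
	return counts
}

// counter is a hypothetical shared tally; mu guards every write to freq.
type counter struct {
	mu   sync.Mutex
	freq map[string]int
}

// merge adds one input's counts to the shared tally under the lock.
func (c *counter) merge(local map[string]int) {
	c.mu.Lock()
	defer c.mu.Unlock()
	for seq, n := range local {
		c.freq[seq] += n
	}
}

// addFrom reads all of r, counts its three-word sequences, and merges them.
func (c *counter) addFrom(r io.Reader) error {
	data, err := io.ReadAll(r)
	if err != nil {
		return err
	}
	c.merge(countTrigrams(string(data)))
	return nil
}

func main() {
	total := &counter{freq: make(map[string]int)}
	if len(os.Args) < 2 {
		// No file arguments: fall back to stdin.
		if err := total.addFrom(os.Stdin); err != nil {
			fmt.Fprintln(os.Stderr, err)
			os.Exit(1)
		}
	} else {
		var wg sync.WaitGroup
		for _, path := range os.Args[1:] {
			wg.Add(1)
			// One goroutine per file; path is passed as an argument so each
			// goroutine gets its own copy.
			go func(p string) {
				defer wg.Done()
				f, err := os.Open(p)
				if err != nil {
					fmt.Fprintln(os.Stderr, err)
					return
				}
				defer f.Close()
				if err := total.addFrom(f); err != nil {
					fmt.Fprintln(os.Stderr, err)
				}
			}(path)
		}
		wg.Wait()
	}
	for seq, n := range total.freq {
		fmt.Printf("%d\t%s\n", n, seq)
	}
}
```

In this sketch each goroutine builds a local map and takes the lock once per file when merging, which keeps contention low. Locking around every single increment, as the description above suggests, would be equally correct but would serialize more of the work.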