Fix for #191 #197

Merged
merged 40 commits into from
Jun 28, 2023
Changes from 3 commits
Commits
5398859
fix #191
nck-mlcnv Apr 8, 2023
58d71bb
simple fix for countLines method
nck-mlcnv Apr 8, 2023
23470d2
Merge branch 'develop' into fix/issue-191/readline-with-different-lin…
nck-mlcnv Apr 11, 2023
6798f8f
add IndexedLineReader with tests
nck-mlcnv Apr 12, 2023
6b8add0
change the IndexedLineReader to comply with current indexing implemen…
nck-mlcnv Apr 14, 2023
ae4bae6
update FileUtils
nck-mlcnv Apr 14, 2023
8bec05d
change getLineEnding method in FileUtils class
nck-mlcnv Apr 18, 2023
78ebd18
refactor IndexedLineReader
nck-mlcnv Apr 18, 2023
5848fe2
restructure FileUtilsTest
nck-mlcnv Apr 18, 2023
ebd623b
fix javadoc
nck-mlcnv Apr 18, 2023
343becc
fix javadoc again
nck-mlcnv Apr 21, 2023
5b52ab2
correct try-resource-blocks
nck-mlcnv Apr 21, 2023
b0ab294
add a test for IndexedLineReader
nck-mlcnv Apr 21, 2023
0077efb
correct try-resource-block for streams too
nck-mlcnv Apr 21, 2023
70a172e
make countLines method skip lines that only contain whitespace charac…
nck-mlcnv May 15, 2023
4398bdc
add docs for constructor of FileSeparatorQuerySource
nck-mlcnv May 15, 2023
41cf274
rename IndexedLineReader to IndexedQueryReader
nck-mlcnv May 17, 2023
1d8fbf6
change try-with-resource instructions
nck-mlcnv May 17, 2023
bd76573
refactor indexing
nck-mlcnv May 25, 2023
5ae9166
fix unit tests
nck-mlcnv May 25, 2023
23eb002
fix size method
nck-mlcnv May 25, 2023
7fed3c9
refactor default separator in FileSeparatorQuerySource
nck-mlcnv May 25, 2023
adc47b4
fix more tests
nck-mlcnv May 26, 2023
065e8fe
fix documentation
nck-mlcnv May 26, 2023
1157ce0
update constructor and readQuery
nck-mlcnv May 26, 2023
74cb9ff
remove unused methods and other minor changes
nck-mlcnv Jun 15, 2023
4de0a20
fix indexFile method and add more test cases
nck-mlcnv Jun 17, 2023
b7b6396
Fix/issue 191/rework parsing (#211)
bigerl Jun 21, 2023
8e1a35d
small change to test
nck-mlcnv Jun 23, 2023
1a86dcf
update documentation
nck-mlcnv Jun 23, 2023
5783fd5
fix QueryHandlerTest
nck-mlcnv Jun 23, 2023
90404f9
Merge branch 'develop' into fix/issue-191/readline-with-different-lin…
nck-mlcnv Jun 23, 2023
41002ac
adjust expected test results to new behaviour of the IndexedQueryReader
nck-mlcnv Jun 26, 2023
2191621
add tests for getLineEnding
nck-mlcnv Jun 26, 2023
cb57354
fix test
nck-mlcnv Jun 26, 2023
e82c165
fix test cases
nck-mlcnv Jun 26, 2023
eb88285
Refactor FileUtilsTest to use temporary files
bigerl Jun 28, 2023
18f7d33
Ensure temporary test files are deleted
bigerl Jun 28, 2023
ff7bbc0
Update FileUtilsTest to use temp file
bigerl Jun 28, 2023
b41fbd2
Refactor FileUtilsTest for safer and more dynamic test data
bigerl Jun 28, 2023
@@ -4,6 +4,8 @@
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/**
* Methods to work easier with Files.
@@ -14,74 +16,79 @@
public class FileUtils {

/**
* Counts the lines in a file efficiently. Props goes to:
* <a href="http://stackoverflow.com/a/453067/2917596">http://stackoverflow.com/a/453067/2917596</a>
* Counts the non-empty lines in a file. The fast byte-scanning path is only used if the line ending is "\n". <br/>
* Source: <a href="http://stackoverflow.com/a/453067/2917596">http://stackoverflow.com/a/453067/2917596</a>
*
* @param filename File to count lines of
* @return No. of lines in File
* @param filename file to count lines of
* @return number of lines in the given file
* @throws IOException
*/
public static int countLines(File filename) throws IOException {
try (InputStream is = new BufferedInputStream(new FileInputStream(filename))) {

byte[] c = new byte[1024];
int count = 0;
int readChars;
boolean empty = true;
byte lastChar = '\n';
while ((readChars = is.read(c)) != -1) {
for (int i = 0; i < readChars; ++i) {
if (c[i] == '\n') {
// Check if line was empty
if (lastChar != '\n') {
++count;
if(getLineEnding((filename.getAbsolutePath())).equals("\n")) {
final int BUFFER_SIZE = 8192;
try (InputStream is = new BufferedInputStream(new FileInputStream(filename), BUFFER_SIZE)) {
byte[] c = new byte[BUFFER_SIZE];
int count = 0;
int readChars = 0;
boolean empty = true;
byte lastChar = '\n';
while ((readChars = is.read(c)) != -1) {
for (int i = 0; i < readChars; ++i) {
if (c[i] == '\n') {
// Check if line was empty
if (lastChar != '\n') {
++count;
}
} else {
empty = false;
}
} else {
empty = false;
lastChar = c[i];
}
lastChar = c[i];
}
if (lastChar != '\n') {
count++;
}
return (count == 0 && !empty) ? 1 : count;
}
if (lastChar != '\n') {
count++;
}
else {
String line = "";
int count = 0;
try(BufferedReader br = new BufferedReader(new FileReader(filename))) {
while ((line = br.readLine()) != null) {
if (!line.isEmpty()) {
count++;
}
}
}
return (count == 0 && !empty) ? 1 : count;
return count;
}
}

/**
* Returns a line at a given position of a File
*
* @param pos line which should be returned
* @param filename File in which the queries are stated
* @return line at pos
* Returns a line at a given position of a File. <br/>
* This method ignores empty lines, so the parameter <code>index</code> refers to the zero-based position among the non-empty lines.
*
* @param index the position of a non-empty line which should be returned
* @param file the file to read from
* @return the line at the given position
* @throws IOException
*/
public static String readLineAt(int pos, File filename) throws IOException {
try (InputStream is = new BufferedInputStream(new FileInputStream(filename))) {
StringBuilder line = new StringBuilder();
public static String readLineAt(int index, File file) throws IOException {
String line = "";
int count = 0;

byte[] c = new byte[1024];
int count = 0;
int readChars;
byte lastChar = '\n';
while ((readChars = is.read(c)) != -1) {
for (int i = 0; i < readChars; ++i) {
if (c[i] == '\n') {
// Check if line was empty
if (lastChar != '\n') {
++count;
}
} else if (count == pos) {
// Now the line
line.append((char) c[i]);
try(BufferedReader br = new BufferedReader(new FileReader(file))) {
while ((line = br.readLine()) != null) {
if (!line.isEmpty()) {
if (count == index) {
return line;
}
lastChar = c[i];
count++;
}
}

return line.toString();
}
}
return "";
}

public static int getHashcodeFromFileContent(String filepath) {
@@ -100,6 +107,64 @@ public static String readFile(String path) throws IOException {
return new String(encoded, StandardCharsets.UTF_8);
}

/**
* This method detects and returns the line-ending used in a file. <br/>
* It reads the whole first line until it detects one of the following line-endings:
* <ul>
* <li>\r\n - Windows</li>
* <li>\n - Linux</li>
* <li>\r - old macOS</li>
* </ul>
*
* If the file doesn't contain a line ending, it defaults to <code>System.lineSeparator()</code>.
*
* @param filepath the string that contains the path of the file
* @return the line ending used in the given file
* @throws IOException
*/
public static String getLineEnding(String filepath) throws IOException {
int lineEndingIndex = 0;
try(BufferedReader br = new BufferedReader(new FileReader(filepath))) {
// readline consumes the line endings mentioned in the javadoc, thus the length of a line equals the index
// of the line's ending
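String firstLine = br.readLine();
// readLine returns null for an empty file, which has no line ending to detect
if (firstLine == null) {
return System.lineSeparator();
}
lineEndingIndex = firstLine.length();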
}

// assumes that line endings can have a maximum of 2 characters
byte[] buffer = new byte[2];
int numberOfreadChars = 0;
try(InputStream is = new BufferedInputStream(new FileInputStream(filepath))) {
is.skip(lineEndingIndex);
numberOfreadChars = is.read(buffer);
}

// the file contains no line ending: read returns -1 when the end of the stream is reached
if(numberOfreadChars <= 0) {
return System.lineSeparator();
}

// converts the buffer to a string
String bufferString = "";
for(int i = 0; i < numberOfreadChars; i++){
bufferString += (char) buffer[i];
}

// The regex pattern "\R" searches for every type of line ending, the result of the pattern matching is the
// result of this method.
// The pattern matching is done here, in case that the line ending has only one character. In that case
// bufferString can still contain 2 characters (i.e. the line ending is "\n" and after the first line there is
// a second, non-empty line, this results in bufferString equaling "\n" + the first character of the second
// line)
Pattern pattern = Pattern.compile("\\R");
Matcher matcher = pattern.matcher(bufferString);
if(matcher.find()) {
return matcher.group();
}
else {
// if for some reason, the matcher still doesn't find a line ending
return System.lineSeparator();
}
}

public static BufferedReader getBufferedReader(File queryFile) throws FileNotFoundException {
return new BufferedReader(new FileReader(queryFile));
}
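
The `getLineEnding` method above opens the file twice and passes a character count to `InputStream.skip`, which counts bytes, so the two can disagree when the first line contains multi-byte characters. As a minimal sketch of an alternative, and not the code merged in this pull request, the first line ending can be found in a single pass over the raw bytes. The class name `LineEndingSketch` and the method `detect` are hypothetical, and the sketch assumes an ASCII-compatible encoding such as UTF-8, where the bytes for `\r` and `\n` never occur inside a multi-byte sequence.

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not part of the FileUtils class shown in the diff.
final class LineEndingSketch {

    /**
     * Returns the first line ending found in the file ("\r\n", "\n" or "\r"),
     * or System.lineSeparator() if the file contains no line ending at all.
     */
    static String detect(Path path) throws IOException {
        try (InputStream in = new BufferedInputStream(Files.newInputStream(path))) {
            int b;
            while ((b = in.read()) != -1) {
                if (b == '\n') {
                    return "\n";
                }
                if (b == '\r') {
                    // "\r\n" if a line feed follows directly, otherwise a lone "\r"
                    return in.read() == '\n' ? "\r\n" : "\r";
                }
            }
        }
        // Same fallback as the method in the diff: no line ending present
        return System.lineSeparator();
    }
}
```

A call such as `LineEndingSketch.detect(Paths.get(filepath))` would stand in for `getLineEnding(filepath)` under these assumptions. An empty file simply falls through to the default.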
@@ -2,8 +2,7 @@

import org.junit.Test;

import java.io.File;
import java.io.IOException;
import java.io.*;

import static org.junit.Assert.assertEquals;
import static org.junit.Assert.assertTrue;
@@ -15,7 +14,30 @@ public void countLinesTest() throws IOException {
//get test file
File f = new File("src/test/resources/fileUtils.txt");
//count lines

long startTime = 0;
long endTime = 0;

startTime = System.nanoTime();
assertEquals(6, FileUtils.countLines(f));
endTime = System.nanoTime();
System.out.println(((double) (endTime - startTime) / 1000000) + "ms");

File f1 = new File("src/test/resources/readLineTestFile1.txt");
File f2 = new File("src/test/resources/readLineTestFile2.txt");
File f3 = new File("src/test/resources/readLineTestFile3.txt");
startTime = System.nanoTime();
assertEquals(4, FileUtils.countLines(f1));
endTime = System.nanoTime();
System.out.println(((double) (endTime - startTime) / 1000000) + "ms");
startTime = System.nanoTime();
assertEquals(4, FileUtils.countLines(f2));
endTime = System.nanoTime();
System.out.println(((double) (endTime - startTime) / 1000000) + "ms");
startTime = System.nanoTime();
assertEquals(4, FileUtils.countLines(f3));
endTime = System.nanoTime();
System.out.println(((double) (endTime - startTime) / 1000000) + "ms");
}

@Test
@@ -29,6 +51,22 @@ public void readLineAtTest() throws IOException {
assertEquals("dfe", FileUtils.readLineAt(4, f));
//read line at -1
assertEquals("", FileUtils.readLineAt(-1, f));

File f1 = new File("src/test/resources/readLineTestFile1.txt");
File f2 = new File("src/test/resources/readLineTestFile2.txt");
File f3 = new File("src/test/resources/readLineTestFile3.txt");
assertEquals("line 1", FileUtils.readLineAt(0, f1));
assertEquals("line 1", FileUtils.readLineAt(0, f2));
assertEquals("line 1", FileUtils.readLineAt(0, f3));
assertEquals("line 2", FileUtils.readLineAt(1, f1));
assertEquals("line 2", FileUtils.readLineAt(1, f2));
assertEquals("line 2", FileUtils.readLineAt(1, f3));
assertEquals("line 3", FileUtils.readLineAt(2, f1));
assertEquals("line 3", FileUtils.readLineAt(2, f2));
assertEquals("line 3", FileUtils.readLineAt(2, f3));
assertEquals("line 4", FileUtils.readLineAt(3, f1));
assertEquals("line 4", FileUtils.readLineAt(3, f2));
assertEquals("line 4", FileUtils.readLineAt(3, f3));
}

@Test
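
The tests in this diff read checked-in fixture files (`readLineTestFile1.txt` to `readLineTestFile3.txt`) whose contents are not shown here. The later commits in the list mention moving `FileUtilsTest` to temporary files. The following is a rough sketch of that idea, not the test code from this pull request: it writes the same four lines with a chosen line ending to a temporary file, counts the non-empty lines with `BufferedReader.readLine`, and deletes the file afterwards. The class and method names are hypothetical, and the assumption that the three fixtures differ only in their line ending is mine.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical sketch, independent of JUnit and of the FileUtils class in the diff.
final class TempFileLineCountSketch {

    /** Counts non-empty lines; readLine() treats "\n", "\r" and "\r\n" as line terminators. */
    static int countNonEmptyLines(Path path) throws IOException {
        int count = 0;
        try (BufferedReader reader = Files.newBufferedReader(path, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (!line.isEmpty()) {
                    count++;
                }
            }
        }
        return count;
    }

    /**
     * Writes four non-empty lines and one blank line, separated by the given
     * line ending, to a temporary file, counts the non-empty lines, and
     * deletes the file again even if counting fails.
     */
    static int countWithLineEnding(String lineEnding) throws IOException {
        Path tmp = Files.createTempFile("lineEndingTest", ".txt");
        try {
            String content = String.join(lineEnding,
                    "line 1", "line 2", "", "line 3", "line 4") + lineEnding;
            Files.write(tmp, content.getBytes(StandardCharsets.UTF_8));
            return countNonEmptyLines(tmp);
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

In a JUnit test, each of `countWithLineEnding("\n")`, `countWithLineEnding("\r\n")` and `countWithLineEnding("\r")` could be compared against the expected count of 4 with `assertEquals`, mirroring the assertions on `f1`, `f2` and `f3` above.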