Logging to help troubleshoot OOM #41
Comments
The Anonimatron code is not multi-threaded. Records are processed table by table and record by record; see JdbcAnonimizerService, line 61 and further. The current table being processed is stored in a log4j NDC before processing of that table starts. Maybe you can adjust your log4j configuration so that it outputs this NDC. The log4j configuration included with Anonimatron should provide a starting point. I am curious what amount or kind of data you are anonymizing that is causing this problem. Usually this points to a configuration problem where the source data is always unique (like a record id) and large.
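As a rough illustration of the suggestion above, a log4j configuration that prints the NDC might look like the following. This is a minimal sketch that assumes log4j 1.x properties syntax; the appender name `stdout` and the exact pattern are illustrative and are not taken from the configuration shipped with Anonimatron. The relevant part is the `%x` conversion character, which log4j's `PatternLayout` replaces with the NDC of the logging thread.

```properties
# Assumption: log4j 1.x properties format; the appender name "stdout" is illustrative.
log4j.rootLogger=INFO, stdout
log4j.appender.stdout=org.apache.log4j.ConsoleAppender
log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
# %x prints the nested diagnostic context (NDC) of the thread that logs the message.
log4j.appender.stdout.layout.ConversionPattern=%d{ISO8601} %-5p [%x] %c - %m%n
```

With a pattern like this, the last log lines written before an OutOfMemoryError should carry whatever was pushed onto the NDC at that point, which according to the comment above is the table being processed.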
Thank you very much for the prompt response. Looking at htop, it seems that from the get-go the heap continues to grow and won't back down. I am going to play around with the java command. Have there been any known memory leaks in version 1.7.? I am trying to upgrade to the latest version possible on my end, sorry.
I am not aware of any current memory issues in Anonimatron; it is being used in large production systems where it completes long anonymization runs (multiple minutes) without a problem. Of course, you could be running into a situation we have not encountered before, so I think we need to keep all options open. Could you share an anonymized version of your config file (please remove passwords and other data you don't want to share online)?
Thank you @realrolfje. Sorry for the late response. We were just able to produce a new anonymized backup yesterday, although we had to use an r5 instance: memory use was roughly 102G at peak and 87G for the remainder of the process. So there must be some tables that are really huge and eating up all our memory (I set the heap to 30G/102G min/max out of 120G). But I still want to thank you for this amazing project. We are currently verifying the produced backup.
Thanks.
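The memory figures in this thread come from outside the JVM (htop, instance size). To cross-check them against what the JVM itself reports, and to confirm that min/max heap settings were actually picked up, something like the sketch below could be run with the same `java` options. `HeapReport` is a hypothetical helper written for this illustration and is not part of Anonimatron; it only uses the standard `java.lang.management` API.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;
import java.util.Locale;

// Hypothetical helper for illustration only; not part of Anonimatron.
class HeapReport {
    private static final double GIB = 1024.0 * 1024.0 * 1024.0;

    // Converts a byte count to gibibytes.
    static double toGiB(long bytes) {
        return bytes / GIB;
    }

    // Formats a one-line summary of the heap figures reported by the JVM.
    static String describe(MemoryUsage heap) {
        // getMax() returns -1 when no maximum heap size is defined.
        String max = heap.getMax() < 0
                ? "undefined"
                : String.format(Locale.ROOT, "%.1f GiB", toGiB(heap.getMax()));
        return String.format(Locale.ROOT,
                "heap used=%.1f GiB committed=%.1f GiB max=%s",
                toGiB(heap.getUsed()), toGiB(heap.getCommitted()), max);
    }

    public static void main(String[] args) {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        System.out.println(describe(memory.getHeapMemoryUsage()));
    }
}
```

The reported maximum should be close to the configured maximum heap size, though the JVM may report a slightly different value. Memory seen by htop also includes non-heap memory, so the two numbers are not expected to match exactly.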
Anonimatron version:
Operating system and version:
Java runtime (java -version): Java 7
Executed commands or actions:
Expected outcome or behavior:
Does not OOM.
Actual outcome or behavior:
This is what we got...
Is there a way in Anonimatron to log which table/database was being processed by the thread that went OOM?
Thanks.