```
,-----.
#,-. ,-.#
() a e ()
( (_) )
#\_ - _/#
,' `"""` `.
,' \X/ `.
/ X ____\
/ v ,` v `,
/ / ( <==+==> )
`-._/|__________\ ^ /
(\\) |______@____\ ^ /
\\ | ( ) \ ^ /
) | \^/
( | |v
<(^)>| |
v | |
| |
|_.--.__ .--._|
`===' `==='
```
Thoughts and code about Information, Chaos and everything in between.
- What is it?
- What is it made of?
- How can it be found (and turned into knowledge)?
- How can it be used optimally to answer questions?
- What can't be known? Is there unattainable information?
This repo tries to gradually find more sophisticated answers to those questions, either through code or through rambling-style posts. Feel free to contact me if you're interested.
This section tries to focus on two fundamental questions:
- What happens when we lack information?
- How should we deal with a chaotic lack of information?
Question:
How can I visualise ideas and how can I determine connections between different ideas?
For this project I tried to turn philosophical ideas into knowledge graphs.
Three different approaches were used to identify entities and relationships:
- Using Rule based Parsing Systems
- Using an LLM (Llama3) to extract entities and relationships via prompting
- Using BERT to extract entities and relationships directly
Finally, the results were visualised using pyvis. As the input size increased, this approach of simply
visualising all entities and their relationships became infeasible.
Hence, this project is on hold until I've solved the question of "What is the most essential information?".
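To give a feel for the rule-based variant, here is a minimal sketch and not the parser used in this repo. It assumes sentences of the form "subject verb object" and a tiny hand-picked verb list; `RELATION_VERBS`, `extract_triples` and `build_graph` are hypothetical names made up for illustration. The resulting edge list is the kind of structure that could then be handed to a graph visualiser such as pyvis.

```python
import re
from collections import defaultdict

# Assumption: a tiny, hand-picked set of relation verbs (hypothetical).
RELATION_VERBS = ("is", "causes", "requires", "implies")

# Rule: "<subject> <relation verb> <object>" with an optional trailing full stop.
PATTERN = re.compile(
    r"^(?P<subj>[\w\s]+?)\s+(?P<rel>"
    + "|".join(RELATION_VERBS)
    + r")\s+(?P<obj>[\w\s]+?)\.?$"
)


def extract_triples(text):
    """Return (subject, relation, object) triples for sentences matching the rule."""
    triples = []
    # Naive sentence split on ., ! or ? followed by whitespace.
    for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
        match = PATTERN.match(sentence.strip())
        if match:
            triples.append(
                (match["subj"].strip().lower(), match["rel"], match["obj"].strip().lower())
            )
    return triples


def build_graph(triples):
    """Group triples into an adjacency mapping: subject -> [(relation, object), ...]."""
    graph = defaultdict(list)
    for subj, rel, obj in triples:
        graph[subj].append((rel, obj))
    return dict(graph)


text = "Knowledge requires information. Chaos implies uncertainty. Nothing to see here!"
triples = extract_triples(text)
graph = build_graph(triples)
for subj, edges in graph.items():
    print(subj, edges)
```

Sentences that do not fit the rule are silently skipped, which is also the main weakness of this kind of parser: recall depends entirely on how many patterns are written by hand.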
Question:
How can I understand which (sub)topics are central in a given document corpus or, more specifically, in a (research) area?
Trying to gain a foothold in an unfamiliar field can easily turn into aimless wandering through a dark forest of (pseudo-)knowledge.
One might need a navigation system to find the central intellectual building blocks of this new field of interest.
The aim of this project is to build exactly this navigation system by developing a tool that automatically identifies central ideas and topics in a given field.
Theoretical Overview:
For a general understanding, a comprehensive list of modern topic modelling techniques can be found here:
Modern Topic Modelling Approaches
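As a rough intuition for "finding central terms in a corpus", here is a deliberately simple stand-in. It uses plain TF-IDF term scoring rather than any of the topic models from the list above, and it is not the tool this project builds. The function names and the toy corpus are assumptions for illustration only.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and keep alphabetic tokens only (a crude assumption)."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_scores(corpus):
    """Sum each term's TF-IDF over all documents in the corpus."""
    docs = [tokens for tokens in (tokenize(d) for d in corpus) if tokens]
    n_docs = len(docs)
    # Document frequency: in how many documents does each term occur?
    doc_freq = Counter(term for doc in docs for term in set(doc))
    scores = Counter()
    for doc in docs:
        for term, count in Counter(doc).items():
            tf = count / len(doc)
            idf = math.log(n_docs / doc_freq[term])
            scores[term] += tf * idf
    return scores


def central_terms(corpus, top_k=3):
    """Return the top_k terms by summed TF-IDF."""
    return [term for term, _ in tfidf_scores(corpus).most_common(top_k)]


corpus = [
    "the entropy of the source",
    "the entropy of the channel",
    "the cat sat on the mat",
]
scores = tfidf_scores(corpus)
print(central_terms(corpus))
```

Terms that occur in every document (here "the") receive a score of zero, which is the property that makes even this crude scheme useful as a first filter before a proper topic model is applied.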
Question:
How can I find the most relevant documents for a given endeavour in a large pool of documents?
For this quest, I manually implemented, trained and evaluated two prominent neural re-ranking algorithms (K-NRM and TK).
Code and results can be found here: Re-Rankers
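For readers unfamiliar with K-NRM, its core idea is kernel pooling: cosine similarities between query and document term embeddings are passed through a set of RBF kernels, summed over document terms, log-scaled, and summed over query terms. The sketch below illustrates that step in plain Python under stated assumptions. It is not the implementation linked above, the embeddings are toy vectors, and the weights passed to `knrm_score` are hypothetical (in the real model they are learned).

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def kernel_pooling(query_vecs, doc_vecs, mus, sigmas):
    """Return one soft-match feature per RBF kernel (mu, sigma)."""
    features = []
    for mu, sigma in zip(mus, sigmas):
        total = 0.0
        for q in query_vecs:
            # Soft term frequency: how many document terms fall near similarity mu.
            soft_tf = sum(
                math.exp(-((cosine(q, d) - mu) ** 2) / (2 * sigma ** 2))
                for d in doc_vecs
            )
            # Clamp before the log to avoid log(0) for empty kernels.
            total += math.log(max(soft_tf, 1e-10))
        features.append(total)
    return features


def knrm_score(features, weights, bias=0.0):
    """Combine kernel features into a ranking score in (-1, 1)."""
    return math.tanh(sum(w * f for w, f in zip(weights, features)) + bias)


# Toy example: one query term, two document terms that both match it exactly.
query_vecs = [[1.0, 0.0]]
doc_vecs = [[1.0, 0.0], [1.0, 0.0]]
features = kernel_pooling(query_vecs, doc_vecs, mus=[1.0, 0.0], sigmas=[0.1, 0.1])
print(features, knrm_score(features, weights=[0.5, 0.1]))
```

The kernel centred at 1.0 behaves like an exact-match counter, while kernels centred at lower similarities pick up softer matches. Re-ranking then amounts to sorting the candidate documents by this score.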
Question:
How can I extend 3. with arbitrary out-of-vocabulary (OOV) queries?
Code and results can be found here:
Q&A
During the projects listed above, I worked with lossless compression algorithms to reduce data sizes (and to identify symbolic morphemes) and implemented the following algorithms:
-
- Turned out to be a pretty good start, losslessly compressing Kant's Critique of Pure Reason by 55%
-
Neural Compressors:
- Variable Rate Semantic Compression
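A size reduction like the one quoted for the lossless compressors above can be measured as follows. This is only a sketch of the measurement: it uses Python's built-in `zlib` as a stand-in, which is not one of the algorithms implemented in this repo, and the sample text is made up.

```python
import zlib


def size_reduction(text):
    """Return the fraction of bytes saved by compressing `text` losslessly."""
    raw = text.encode("utf-8")
    packed = zlib.compress(raw, 9)  # 9 = strongest compression level
    # Lossless means the round trip must reproduce the input exactly.
    assert zlib.decompress(packed) == raw
    return 1 - len(packed) / len(raw)


# Assumption: a highly repetitive toy text, so the saving will be large.
sample = "information and chaos " * 200
reduction = size_reduction(sample)
print(f"reduced by {reduction:.0%}")
```

A reduction of 0.55 would correspond to the "compressed by 55%" figure, meaning the output is 45% of the original size.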