add tags to old articles
parent 9d1edbc90d
commit 4f9b7101c6
@@ -1,7 +1,7 @@
Title: Cached Neocities Uploads
Brief: Making uploading of directories to Neocities less painful.
Date: 1707585916
-Tags: Programming, Bash
+Tags: Programming, Bash, Script
CSS: /style.css

Quick and dirty Bash-based sha256sum checksum solution to create stamps for later checking and rejection.
@@ -1,7 +1,7 @@
Title: Slim Summer Elf
Brief: Making of minimal x86 (Linux) ELF executable.
Date: 1684666702
-Tags: Programming, Linux, C
+Tags: Programming, Linux, C, Bash, Linker, Low-level
CSS: /style.css

Code below was composed for [4mb-jam](https://itch.io/jam/4mb-jam-2023) which I didn't finish.
36
articles/toponym-extractor/page.mmd
Normal file
@@ -0,0 +1,36 @@
Title: Geonames Toponym Extractor Utility
Brief: Simple script for extracting ASCII toponym fields from geonames datasets
Date: 1713683410
Tags: Python, Script, Programming
CSS: /style.css

[Link to code](https://codeberg.org/veclavtalica/geonames-extractor)

A small script I used for extracting data for machine learning endeavors.

Usage:
```
dataset feature_class [feature_code] [--dirty] [--filter=mask]
```

From this invocation ...
```
./extractor.py datasets/UA.txt P PPL --filter=0123456789\"\'-\` > UA-prep.txt
```

... it produces a newline-separated list of relevant toponyms of a particular kind, such as:
```
Katerynivka
Vaniushkyne
Svistuny
Sopych
Shilova Balka
```
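
The selection step behind such an invocation can be sketched roughly as below. This is not the code from the linked repository, only an illustration. It assumes the standard tab-separated GeoNames dump layout, in which the third column is `asciiname`, the seventh is the feature class and the eighth is the feature code. The function name `extract_toponyms` and the two sample records are made up for the example.

```python
import io

# Column positions in the tab-separated GeoNames dump (0-based).
ASCIINAME, FEATURE_CLASS, FEATURE_CODE = 2, 6, 7


def extract_toponyms(lines, feature_class, feature_code=None):
    """Yield ASCII names of records matching the feature class (and code, if given)."""
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) <= FEATURE_CODE:
            continue  # skip malformed or truncated records
        if fields[FEATURE_CLASS] != feature_class:
            continue
        if feature_code is not None and fields[FEATURE_CODE] != feature_code:
            continue
        if fields[ASCIINAME]:
            yield fields[ASCIINAME]


# Two made-up records in the dump layout (real rows carry more columns).
sample = io.StringIO(
    "1\tKaterynivka\tKaterynivka\t\t48.0\t37.0\tP\tPPL\tUA\n"
    "2\tDnipro\tDnipro\t\t48.5\t35.0\tH\tSTM\tUA\n"
)
names = list(extract_toponyms(sample, "P", "PPL"))
print("\n".join(names))
```

Only the populated place (`P` / `PPL`) survives here; the stream record is skipped because its feature class differs.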

The `--filter=` option is there so that the alphabet size can be reduced for learning purposes,
as there are usually quite a lot of symbols that occur only a few times,
which leaves the dataset poorly balanced.
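
To illustrate the effect on alphabet size, here is a small sketch. The text does not say exactly how the mask is applied, so this assumes that any name containing a masked character is rejected outright; the script itself may behave differently. The helper names `symbol_counts` and `apply_filter` are hypothetical.

```python
from collections import Counter


def symbol_counts(names):
    """Count how often each character occurs across all names."""
    return Counter(ch for name in names for ch in name)


def apply_filter(names, mask):
    """Keep only names containing none of the characters in `mask` (assumed semantics)."""
    banned = set(mask)
    return [name for name in names if banned.isdisjoint(name)]


names = ["Sopych", "Svistuny", "Ring Mast 4", "Kam'yanka"]
# Same mask as in the invocation above, with the shell escaping removed.
kept = apply_filter(names, "0123456789\"'-`")
alphabet_before = len(symbol_counts(names))
alphabet_after = len(symbol_counts(kept))
print(kept, alphabet_before, alphabet_after)
```

Inspecting `symbol_counts` on the unfiltered output is also a quick way to decide which rare symbols are worth putting into the mask.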

The `--dirty` option reduces cases such as `Maydan (Ispas)` and `CHAYKA-Transmitter, Ring Mast 4` to `Maydan` and `CHAYKA-Transmitter`.

Duplicates are also removed.
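
A minimal sketch of these two steps, under assumptions: judging only by the two examples given, everything from the first `(` or `,` onwards is treated as a qualifier and cut off, and duplicates are dropped while keeping first-seen order. The actual script may use different rules; `clean_name` and `unique` are hypothetical names.

```python
import re

# Assumption: everything from the first "(" or "," onwards is a qualifier.
_QUALIFIER = re.compile(r"\s*[(,].*$")


def clean_name(name):
    """Strip a trailing parenthesised or comma-separated qualifier."""
    return _QUALIFIER.sub("", name).strip()


def unique(names):
    """Drop duplicates while keeping first-seen order."""
    return list(dict.fromkeys(names))


raw = ["Maydan (Ispas)", "CHAYKA-Transmitter, Ring Mast 4", "Maydan", "Sopych"]
cleaned = unique(clean_name(n) for n in raw)
print(cleaned)
```

Cleaning before deduplicating matters: `Maydan (Ispas)` and `Maydan` only collapse into one entry once the qualifier is gone.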