Details
- Type: Bug
- Status: Resolved
- Priority: Major
- Resolution: Fixed
- Affects Version/s: None
- Fix Version/s: 1.8.0
- Component/s: filters
- Labels: None
Description
The issues fall into two categories:
1. A Greek accent that is not filtered out
2. A non-diacritic character that is filtered out but shouldn't be
In #1 the accent is U+0345 ͅ COMBINING GREEK YPOGEGRAMMENI
In #2 the character is U+2019 ’ RIGHT SINGLE QUOTATION MARK
The latter is NOT a Greek accent. AFAIK, there's no valid reason to filter this out.
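The distinction between the two characters can be checked against the Unicode Character Database. The following is a minimal sketch using Python's standard `unicodedata` module; it is not taken from the SWORD filter, and the helper names `describe` and `is_combining_mark` are hypothetical, chosen only for illustration.

```python
import unicodedata


def describe(ch: str) -> dict:
    """Collect the Unicode properties relevant to deciding whether ch is an accent."""
    return {
        "codepoint": f"U+{ord(ch):04X}",
        "name": unicodedata.name(ch, "<unnamed>"),
        "category": unicodedata.category(ch),
        "combining_class": unicodedata.combining(ch),
    }


def is_combining_mark(ch: str) -> bool:
    """True for the general categories Mn, Mc and Me (combining marks)."""
    return unicodedata.category(ch).startswith("M")


YPOGEGRAMMENI = "\u0345"  # issue #1: should be filtered, but is not
RIGHT_QUOTE = "\u2019"    # issue #2: is filtered, but should not be

for ch in (YPOGEGRAMMENI, RIGHT_QUOTE):
    print(describe(ch), is_combining_mark(ch))
```

U+0345 is reported as a nonspacing mark (category Mn), whereas U+2019 is final-quote punctuation (category Pf) with combining class 0, so a filter that removes only combining marks would have no reason to touch it.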
For reporting details, please refer to the recent discussion in sword-devel.
For detailed background, please refer to "Greek diacritics".
Beyond these two particular issues, there is a broader concern: the filter's scope is not restricted. Some of the accents used in Greek are not specific to Greek but are general combining characters used in other languages too, so when the filter is applied to non-Greek text, it removes diacritics that should be retained in that context.
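One way a restricted scope could look is sketched below. This is an assumption-laden illustration in Python, not the SWORD filter's implementation: it decomposes with canonical NFD, drops combining marks only when the preceding base character is Greek, and recomposes with NFC. Python's standard library has no script property, so the sketch approximates "Greek" by checking whether the character name starts with `GREEK`; the function name `strip_greek_diacritics` and both helpers are hypothetical.

```python
import unicodedata


def _is_mark(ch: str) -> bool:
    """True for combining marks (general categories Mn, Mc, Me)."""
    return unicodedata.category(ch).startswith("M")


def _is_greek(ch: str) -> bool:
    """Heuristic script test: the Unicode character name starts with 'GREEK'."""
    return unicodedata.name(ch, "").startswith("GREEK")


def strip_greek_diacritics(text: str) -> str:
    """Remove combining marks that follow a Greek base character; keep all others."""
    kept = []
    base_is_greek = False
    for ch in unicodedata.normalize("NFD", text):
        if _is_mark(ch):
            if base_is_greek:
                continue  # accent attached to a Greek letter: drop it
        else:
            base_is_greek = _is_greek(ch)  # a new base character starts here
        kept.append(ch)
    # Recompose so that untouched non-Greek text returns to its composed form.
    return unicodedata.normalize("NFC", "".join(kept))


mixed = "\u03bb\u03cc\u03b3\u03bf\u03c2 caf\u00e9"  # Greek word with tonos, then "café"
print(strip_greek_diacritics(mixed))
```

With this approach the acute accent on the Greek omicron is removed while the same combining character on the Latin "e" is retained. The NFD/NFC round trip still rewrites characters that have singleton canonical decompositions, so it is canonically equivalent to the input rather than byte-identical.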
Furthermore, because the filter normalizes the text to Unicode NFKC before removing the combining characters, it has a side effect: some unusual codepoints are not restored afterwards, because compatibility decomposition is not reversible for those codepoints.
Example: U+00BE VULGAR FRACTION THREE QUARTERS ¾ becomes 3/4
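The irreversibility can be reproduced outside the filter. The sketch below uses Python's standard `unicodedata` module and only illustrates the normalization behaviour; it makes no claim about how the filter itself calls its Unicode library. The variable names are arbitrary.

```python
import unicodedata

THREE_QUARTERS = "\u00be"  # VULGAR FRACTION THREE QUARTERS

# Compatibility normalization replaces the single codepoint with three.
nfkc = unicodedata.normalize("NFKC", THREE_QUARTERS)
print([f"U+{ord(c):04X}" for c in nfkc])

# Canonical normalization leaves the codepoint alone.
nfc = unicodedata.normalize("NFC", THREE_QUARTERS)
print([f"U+{ord(c):04X}" for c in nfc])

# No normalization form maps the three-codepoint sequence back to U+00BE.
restored = unicodedata.normalize("NFC", nfkc)
print(restored == THREE_QUARTERS)
```

The NFKC result is the digit 3, U+2044 FRACTION SLASH and the digit 4, which is what appears as "3/4" in the example above. Canonical forms (NFD/NFC) do not apply compatibility mappings, so they avoid this particular loss.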
NB. "Affects Versions" seems to be out of date in the CrossWire Tracker.
My observations were made using Xiphos 4.0.4 and diatheke version 4.7.