author: wxiaoguang <wxiaoguang@gmail.com> 2023-03-02 19:33:36 +0100
committer: GitHub <noreply@github.com> 2023-03-02 19:33:36 +0100
commit: d72462dae62b6d76ddd47f6bbadbfe0352e03f89
tree: 2e4dca6e4af5ad54a37477380b122ba0e6915b82 /build/update-locales.sh
parent: Refactor `ctx` in templates (#23105)
Improve update-locales script and fix locale processing bug (#23240)
Gitea's locales have been broken for a long time and are still not fully fixed. One root cause is that the `ini` library is quite quirky and the `update-locales` script does not handle all cases. This PR fixes the `update-locales` script so that its output satisfies both the `ini` library and Crowdin. See the comments in the script for more details.

`locale_zh-CN.ini` is an example: it comes from Crowdin and has been processed by the new `update-locales.sh`. See `feed_of` in particular: https://github.com/go-gitea/gitea/pull/23240/files#diff-321f6ca4eae1096eba230e93c4740f9903708afe8d79cf2e57f4299786c4528bR268
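As a rough illustration of the unquoting rules described in the script comments, the sketch below applies the same three substitutions to a few sample lines. It is not part of the PR: `unquote_locale` is a hypothetical helper name, the sample keys are made up, and it filters stdin with `sed -E` rather than editing files in place with `-i -r`, so GNU sed is assumed only for the multi-line `-e` block.

```shell
#!/bin/bash
set -e

# Hypothetical helper (not in the PR): runs the script's three
# substitutions on stdin instead of editing locale files in place.
unquote_locale() {
  sed -E \
    -e '/^[-.A-Za-z0-9_]+[ ]*=[ ]*".*"$/ {
      s/^([-.A-Za-z0-9_]+)[ ]*=[ ]*"/\1=/
      s/"$//
      s/\\"/"/g
    }' \
    -e 's/^([-.A-Za-z0-9_]+)[ ]*=[ ]*(".*[^"])$/\1=`\2`/' \
    -e 's/^([-.A-Za-z0-9_]+)[ ]*=[ ]*([^"].*")$/\1=`\2`/'
}

# Made-up sample lines covering the cases named in the comments.
printf '%s\n' \
  'plain=hello' \
  'quoted="x\"y"' \
  'open="foo' \
  'close=bar"' | unquote_locale
# prints:
#   plain=hello
#   quoted=x"y
#   open=`"foo`
#   close=`bar"`
```

Fully quoted values lose their outer quotes and have `\"` unescaped, while values left with an unbalanced quote at either end are wrapped in backticks so the `ini` library reads them literally.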
Diffstat (limited to 'build/update-locales.sh')
-rwxr-xr-x build/update-locales.sh | 45
1 files changed, 40 insertions, 5 deletions
diff --git a/build/update-locales.sh b/build/update-locales.sh
index 046f48ee86..b7611c0c9a 100755
--- a/build/update-locales.sh
+++ b/build/update-locales.sh
@@ -1,14 +1,49 @@
-#!/bin/sh
+#!/bin/bash
+
+set -e
+
+SED=sed
+
+if [[ $OSTYPE == 'darwin'* ]]; then
+ # for macOS developers, use "brew install gnu-sed"
+ SED=gsed
+fi
+
+if [ ! -f ./options/locale/locale_en-US.ini ]; then
+ echo "please run this script in the root directory of the project"
+ exit 1
+fi
mv ./options/locale/locale_en-US.ini ./options/
-# Make sure to only change lines that have the translation enclosed between quotes
-sed -i -r -e '/^[a-zA-Z0-9_.-]+[ ]*=[ ]*".*"$/ {
- s/^([a-zA-Z0-9_.-]+)[ ]*="/\1=/
- s/\\"/"/g
+# the "ini" library for locale has many quirks
+# * `a="xx"` gets `xx` (no quote)
+# * `a=x\"y` gets `x\"y` (no unescaping)
+# * `a="x\"y"` gets `"x\"y"` (no unescaping, the quotes are still there)
+# * `a='x\"y'` gets `x\"y` (no unescaping, no quote)
+# * `a="foo` gets `"foo` (although the quote is not closed)
+# * 'a=`foo`' works like single-quote
+# crowdin needs the strings to be quoted correctly and doesn't like incomplete quotes
+# crowdin always outputs quoted strings if there are quotes in the strings.
+
+# this script helps to unquote the crowdin outputs for the quirky ini library
+# * find all `key="...\"..."` lines
+# * remove the leading quote
+# * remove the trailing quote
+# * unescape the quotes
+# * eg: key="...\"..." => key=..."...
+$SED -i -r -e '/^[-.A-Za-z0-9_]+[ ]*=[ ]*".*"$/ {
+ s/^([-.A-Za-z0-9_]+)[ ]*=[ ]*"/\1=/
s/"$//
+ s/\\"/"/g
}' ./options/locale/*.ini
+# * if the escaped line is incomplete like `key="...` or `key=..."`, quote it with backticks
+# * eg: key="... => key=`"...`
+# * eg: key=..." => key=`..."`
+$SED -i -r -e 's/^([-.A-Za-z0-9_]+)[ ]*=[ ]*(".*[^"])$/\1=`\2`/' ./options/locale/*.ini
+$SED -i -r -e 's/^([-.A-Za-z0-9_]+)[ ]*=[ ]*([^"].*")$/\1=`\2`/' ./options/locale/*.ini
+
# Remove translation under 25% of en_us
baselines=$(wc -l "./options/locale_en-US.ini" | cut -d" " -f1)
baselines=$((baselines / 4))