Wikifunctions:Catalogue/String operations

String evaluation operations
These functions perform simple tests on a text string to tell you whether something else needs to be done or whether the string is already in an expected format.
- string length (Z11040): Return the length of this string
- string length in UTF-8 code units (Z17036): Return the length of this string in UTF-8 code units
- string length in UTF-16 code units (Z17030): Return the length of this string in UTF-16 code units
- is empty string (Z10008): true if the input string is strictly empty, even without any non-printing characters, and false otherwise
- is string blank (Z10083): checks if a string contains only whitespace
- is numeric (Z10715): Checks if a string contains only numeric characters
- is uppercase (Z10336): checks if string is uppercase (equal to its own uppercase - so blanks and empty strings count)
- has and is uppercase (Z11349): checks if string has and is uppercase (blanks and strings without letters don't count)
- is lowercase (Z10346): checks if string is all lowercase
- has and is lowercase (Z11383): checks if string has and is lowercase (blanks and strings without letters don't count)
- is title case (Z10375): checks if string is in title case
- is pascal case (Z10363): checks if string is in pascal case
- is camel case (Z10897): true if the entered string is in camel case (e.g. 'camelCase')
- is snake case (Z10324): checks if string is in snake case
- fallback if string is empty (Z11082): returns a fallback string if the value is empty, and the value itself if not
- has specified chars paired (Z11678): check if a string has correctly paired chars (for example, brackets), specifying the left and right chars (brackets) in sequence
- has all brackets paired (Z11684): check for pairing of all possible left-right paired characters, feel free to extend
- is pangram (Latin alphabet) (Z12626): checks whether a string of characters possesses every letter from the Latin alphabet at least once
- is a palindrome (Z10096): test if a string is the same when read forward and backward (see Z10553 for one with Unicode grapheme support)
- has double letter (Z19170): tests whether the string has any letter (case sensitive) used twice in a row
- is square-free (Z19191): a combinatorial term: a word that avoids the pattern XX, where X is any non-empty sequence of letters
- Is it a valid ISO 6709 code (Z19217): checks if a string matches ISO 6709.
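The difference between "is uppercase" and "has and is uppercase", and the square-free test, can be illustrated with a minimal Python sketch. The function names below are hypothetical and chosen for this example; they are not the Wikifunctions implementations, and the exact edge-case behaviour of the catalogued functions is an assumption based on their descriptions.

```python
def is_uppercase(text: str) -> bool:
    # Equal to its own uppercase form, so "" and "123" count as uppercase.
    return text == text.upper()


def has_and_is_uppercase(text: str) -> bool:
    # str.isupper() additionally requires at least one cased character.
    return text.isupper()


def is_square_free(word: str) -> bool:
    # Reject any adjacent repeat XX, where X is a non-empty block.
    n = len(word)
    for start in range(n):
        for size in range(1, (n - start) // 2 + 1):
            first = word[start:start + size]
            second = word[start + size:start + 2 * size]
            if first == second:
                return False
    return True
```

For instance, `is_uppercase("123")` is true while `has_and_is_uppercase("123")` is false, and `"banana"` is not square-free because it contains `"anan"`.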
String comparison operations
- string equality (Z866): True if the first string and the second string are the same
- case-insensitive string equality (Z10539): returns true if both strings are the same if converted to lowercase
- string inequality (Z10379): true if two text strings are not exactly equal
- has substring (Z10070): Check if a substring exists within another string
- count substrings (Z14450): returns the number of times a substring occurs in a string
- string starts with (Z10615): returns true if the substring exists at the beginning of the string
- string ends with (Z10618): true if the substring exists at the end of the string
- strings equal length (Z11690): checks if the two input strings are of equal length
- longer string (Z11519): Returns the longer of two strings. If equal, defaults to returning the first.
- string only has characters from alphabet (Z11693): check if all of the characters in the tested string are from the alphabet string
- common codepoints in strings (Z14483): true if the two strings contain any codepoints (~characters) in common
- is pangram of alphabet (Z13119): check if the string uses every letter of a specified (lowercase) alphabet
- is anagram (simple) (Z10973): test if the same characters at the same number of times are used in two strings (characters must be exact code points).
- is heterogram (Z11573): True if no character occurs more than once
- is ISO 639-1 language code (Z13482): validates whether a string is a valid ISO 639-1 language code
- is ISO 639-2 language code (Z14083): validates whether a string is a valid ISO 639-2 language code
- is subword of string (Z19177): the subword is contained (in order) in the string, but may be interspersed with other letters
- hamming distance between two strings (Z11328): the strings should be of equal length, otherwise will return -1
- Levenshtein Distance (Z10393): returns a number as the result
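Three of the comparisons above (Hamming distance, subword test, Levenshtein distance) can be sketched in Python as follows. The function names are hypothetical; the `-1` result for unequal lengths follows the catalogue description, and the rest is a standard textbook formulation rather than the code behind the listed functions.

```python
def hamming_distance(a: str, b: str) -> int:
    # Number of positions that differ; -1 signals unequal lengths.
    if len(a) != len(b):
        return -1
    return sum(x != y for x, y in zip(a, b))


def is_subword(subword: str, text: str) -> bool:
    # Each character must be found, in order, in what remains of text.
    remaining = iter(text)
    return all(ch in remaining for ch in subword)


def levenshtein_distance(a: str, b: str) -> int:
    # Row-by-row dynamic programming over edit operations.
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,               # deletion
                current[j - 1] + 1,            # insertion
                previous[j - 1] + (ca != cb),  # substitution
            ))
        previous = current
    return previous[-1]
```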
String discard functions
- discard from start of first substring (Z11410): if the substring is found in the full string, discard everything after and including the first occurrence, otherwise leave unchanged
- discard from end of first substring (Z11412): if the substring is found in the full string, discard everything after but not including the first occurrence, otherwise leave unchanged
- discard until start of first substring (Z11418): if the substring is found in the full string, discard everything before but not including the first occurrence, otherwise leave unchanged
- discard until end of first substring (Z11420): if the substring is found in the full string, discard everything before and including the first occurrence, otherwise leave unchanged
- discard from start of last substring (Z11414): if the substring is found in the full string, discard everything after and including the last occurrence, otherwise leave unchanged
- discard from end of last substring (Z11416): if the substring is found in the full string, discard everything after but not including the last occurrence, otherwise leave unchanged
- discard until start of last substring (Z11422): if the substring is found in the full string, discard everything before but not including the last occurrence, otherwise leave unchanged
- discard until end of last substring (Z11424): if the substring is found in the full string, discard everything before and including the last occurrence, otherwise leave unchanged
- remove at end (Z11170): if a string ends with the given suffix, remove the suffix, otherwise return the string unchanged
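The eight discard variants differ only in which occurrence is used (first or last), which side is dropped (from or until), and whether the cut falls at the start or the end of the substring. A minimal Python sketch, assuming a single hypothetical helper `discard` with keyword flags (not how the Wikifunctions entries are actually organised):

```python
def discard(full: str, sub: str, *, last: bool = False,
            until: bool = False, at_end: bool = False) -> str:
    # Locate the first or last occurrence; leave unchanged if absent.
    index = full.rfind(sub) if last else full.find(sub)
    if index < 0:
        return full
    # Cut at the start or at the end of the matched substring.
    cut = index + len(sub) if at_end else index
    # "until" drops everything before the cut, "from" everything after.
    return full[cut:] if until else full[:cut]


def remove_at_end(full: str, suffix: str) -> str:
    # Strip the suffix once if present; an empty suffix changes nothing.
    if suffix and full.endswith(suffix):
        return full[:len(full) - len(suffix)]
    return full
```

With `full = "a-b-c"` and `sub = "-"`, `discard(full, sub)` gives `"a"` (discard from start of first substring) and `discard(full, sub, last=True, until=True, at_end=True)` gives `"c"` (discard until end of last substring).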
String character discard functions
These reduce a string by discarding certain characters.
- remove regular spaces (Z10052): remove all regular spaces (U+0020) from a string
- trim string (Z10079): remove leading and trailing whitespace
- remove characters in character range (Z11531): strips all characters from a codepoint block from a string
- remove characters in unicode range (Z14119): strips all characters from a codepoint block (specified by Unicode code points) from a string
- remove interpunction (Z11193): remove all interpunction characters
- remove first character (Z14456): removes the first character of a string and returns the rest
- remove last character (Z11879): returns the string without its last character
- final N characters of string (Z14460): return only the last N characters of the initial string
- remove first N characters of string (Z14636): return the string with the first N characters removed
- first N characters of string (Z14592): returns a substring from the beginning of a specified string up to a number of characters
- character Nth from the end of the string (Z14463): returns the Nth character from the end; the return value is a string
- remove all characters except Arabic numerals (Z14494): no description
- remove all characters except ASCII digits, uppercase Latin letters and lowercase Latin letters (Z10171): removes all characters other than half-width alphanumeric characters from the string
- remove all characters not in second string (Z14515): leaves only the characters in string 1 that are also in string 2
- remove all characters in second string (Z14520): leaves only the characters in string 1 that are not in string 2
- remove repeated characters (Z19185): remove repeat occurrences of any character in the string, just leave the first one
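Several of the character filters above reduce to a single pass over the string. A minimal Python sketch with hypothetical function names, working on code points (an assumption: grapheme clusters are not treated specially, and the range bounds are taken as inclusive):

```python
def keep_only(text: str, allowed: str) -> str:
    # Leave only the characters of text that also occur in allowed.
    return "".join(ch for ch in text if ch in allowed)


def remove_all(text: str, banned: str) -> str:
    # Leave only the characters of text that do not occur in banned.
    return "".join(ch for ch in text if ch not in banned)


def remove_repeated(text: str) -> str:
    # dict preserves insertion order, so the first occurrence is kept.
    return "".join(dict.fromkeys(text))


def remove_codepoint_range(text: str, first: int, last: int) -> str:
    # Drop characters whose code point lies in [first, last].
    return "".join(ch for ch in text if not first <= ord(ch) <= last)
```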
Simple String transformations
These perform character replacements and other basic operations.
Replace suffix
- replace at end (Z11178): replaces suffix with replacement if input ends with suffix; if not, returns input unchanged
- replace suffix "a" with "ors" (Z18092): no description
- replace suffix "a" with "ons" (Z18026): no description
- replace suffix "a" with "on" (Z17827): E.g. öga -> ögon
- replace suffix "a" with "orna" (Z17915): E.g. gata -> gatorna
- replace suffix "a" with "ornas" (Z17918): E.g. gata -> gatornas
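The generic "replace at end" behaviour, of which the fixed-suffix functions above appear to be special cases, can be sketched in Python. The name `replace_at_end` mirrors the catalogue label, but the code is illustrative only; treating an empty suffix as "no match" is an assumption.

```python
def replace_at_end(text: str, suffix: str, replacement: str) -> str:
    # Swap the suffix for the replacement only when text ends with it.
    if suffix and text.endswith(suffix):
        return text[:len(text) - len(suffix)] + replacement
    return text


print(replace_at_end("gata", "a", "orna"))  # gatorna
print(replace_at_end("öga", "a", "on"))     # ögon
print(replace_at_end("hus", "a", "on"))     # hus (unchanged)
```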
Add string suffix if not already present
- add suffix to string if it does not already end with the suffix (Z17973): E.g. testing + ing -> testing
- add suffix "ns" to string if it does not end with "ns" (Z18066): for Swedish
- add suffix "en" to string if it does not end with "en" (Z18050): no description
- add suffix "ets" to string if it does not end with "ets" (Z18042): no description
- add suffix "ens" to string if it does not end with "ens" (Z18039): no description
- add suffix "enas" to string if it does not end with "enas" (Z18036): E.g. huvud -> huvudenas
- add suffix "s" to string if it does not already end with "s" (Z18020): E.g. test -> tests
- add suffix "ts" to string if it does not already end with "ts" (Z18017): no description
- add suffix "nas" to string if it does not already end with "nas" (Z17952): no description
- add suffix "rnas" to string if it does not end with "rnas" (Z17942): E.g. fiende -> fiendernas
- add suffix "rna" to string if it does not end with "rna" (Z17939): E.g. fiende -> fienderna
- add suffix "na" to string if it does not end with "na" (Z17946): no description
- add suffix "r" to string if it does not end with "r" (Z17749): E.g. fiende -> fiender
- add suffix "t" to string if it does not end with "t" (Z17904): E.g. äpple -> äpplet
- add suffix "a" to string if it does not end in "a" (Z17948): no description
- add suffix "n" to string if it does not already end with "n" (Z17791): E.g. äpple -> äpplen
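The fixed-suffix entries above can be read as the generic function with its suffix argument bound in advance. A minimal Python sketch, assuming hypothetical names `add_suffix_if_missing` and `add_r` (the catalogue does not state how the specialised functions are implemented):

```python
from functools import partial


def add_suffix_if_missing(text: str, suffix: str) -> str:
    # Append the suffix unless the text already ends with it.
    return text if text.endswith(suffix) else text + suffix


# A specialised variant obtained by fixing the suffix argument.
add_r = partial(add_suffix_if_missing, suffix="r")

print(add_suffix_if_missing("testing", "ing"))  # testing
print(add_r("fiende"))                          # fiender
```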
Other transformations
- join two strings (Z10000): combine two strings, one after the other
- concatenate many strings (Z21394): no description
- join list of strings (Z12899): returns string composed of list elements separated by a given delimiter
- reverse string (Z10012): Inverts the order of the characters in a String (see Z10548 for one with Unicode grapheme support)
- replace all substrings (Z10075): finds and replaces all instances of a substring in an input string
- replace character set (Z14613): replaces each character of the first string that appears in the second string with the corresponding character in the third string
- wrap string (Z11145): add wrapper text to the start and end of a string
- unwrap string (Z11151): removes text from start and end of string once only if both are present
- duplicate string (Z10753): takes a string and returns it duplicated
- Replicate string n-times (Z12624): Replicates a string n times: (e.g. f("a",5) -> "aaaaa")
- duplicate string N times (Z10911): duplicates the input string N times and outputs the copies joined together
- regular expression substitute with flags (Z12316): $N for capture groups. Flags supported should at least be 'i', 'm', and 'g'.
- replace all (regex, case sensitive) (Z10193): replace characters in a string with another string according to a regex pattern
- echo string except for specific replacement (Z18898): returns the same string, unless it matches a specific string when it returns a specific string
- to Title Case (Z10251): converts a string to title case
- to PascalCase (Z10290): convert string to Pascal Case
- to camelCase (Z10816): convert string to lower camelCase
- to snake_case (Z10281): convert string to snake case
- string to hex (UTF-8) (Z10366): convert string of UTF-8 characters into hexadecimal
- hex (string) to string (UTF-8) (Z10373): convert a hexadecimal string back into a string of UTF-8 characters
- URI percent encode (Z10761): encodes certain characters using URI percent encoding syntax
- URI percent decode (Z10774): decodes a percent-encoded input string
- international morse code encode (Z10944): encodes the supplied string in morse code, separating letter encodings by spaces and words by " / "
- international morse code decode (Z10956): decodes the supplied string in morse code: separate letter encodings by spaces and words by " / "
- encode NATO phonetic alphabet code (Z10309): requires ALLCAPS input, e.g. EXAMPLE
- decode NATO phonetic alphabet code (Z10970): case insensitive
- Infix to Postfix (Z13060): converts infix operators and operands to postfix format
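As an illustration of "replace character set", "wrap string" and "unwrap string", here is a minimal Python sketch. The function names and parameter order are hypothetical, and it assumes that the wrapper has separate left and right parts and that the second and third strings of the character replacement have equal length.

```python
def replace_character_set(text: str, old_chars: str, new_chars: str) -> str:
    # Map the i-th character of old_chars to the i-th of new_chars.
    if len(old_chars) != len(new_chars):
        raise ValueError("character sets must have the same length")
    return text.translate(str.maketrans(old_chars, new_chars))


def wrap(text: str, left: str, right: str) -> str:
    # Add wrapper text to the start and end.
    return left + text + right


def unwrap(text: str, left: str, right: str) -> str:
    # Remove one layer of wrapping, only if both ends are present.
    fits = len(text) >= len(left) + len(right)
    if fits and text.startswith(left) and text.endswith(right):
        return text[len(left):len(text) - len(right)]
    return text
```

For example, `replace_character_set("hello", "el", "ip")` yields `"hippo"`, and `unwrap("[[abc]]", "[", "]")` yields `"[abc]"` because only one layer is removed.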
Color operations
- Mix colours (Z12997): Calculates the midpoint between two colours. It prefers input in hexadecimal but also accepts basic colour names.
- convert hex color (Z13017): converts a hexadecimal color code into HSL, HSV, RGB, and CMYK formats
- convert hex colour to [R,G,B] (Z17664): output is a list of three natural numbers, each between 0 and 255
- convert [R,G,B] to hex colour (Z17687): input is a triplet of natural numbers between 0 and 255; output is lowercase hex preceded by #
- convert X11 colour to hex (Z17713): converts colour names to hex (including leading #) https://www.w3.org/TR/css-color-3/#svg-color
- opposite colour (Z13023): in the RGB color space
- colour contrast ratio (Z13028): returns colour contrast ratio 'X:1' for given hex colours
- Tint of colour (Z18184): It will mix a color with white by a given percentage.
- Shade of colour (Z18189): Returns the shade of a colour by mixing it with a percentage of black.
- Tone of colour (Z18196): Returns the tone of a color by mixing it with gray
- Analogous colour (Z18204): Returns the colours which are 30 degrees apart from the input base colour.
- Tetradic colours (square) (Z18208): Returns colours that are 90 degrees apart from the input base colour.
- Triadic colours (Z18212): Returns the two colours that are 120 degrees and 240 degrees apart from the input base colour.
- Saturation of colour (Z18263): Returns the intensity of a colour. 100% saturation means there is no addition of gray.
- Lightness of colour (Z18268): Returns the measure of how light or dark a colour is, with 0% being completely black and 100% being completely white
- Subtractive colour (Z18296): Subtract the second colour from the first colour.
- Additive colours (Z18300): Additively mix two hex colours using the RGB model.
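The hex and [R,G,B] conversions, the RGB opposite, and the midpoint mix can be sketched in Python. The function names are hypothetical, only six-digit hex input is handled (colour names are not), and rounding the midpoint down with integer division is an assumption rather than documented behaviour.

```python
def hex_to_rgb(colour: str) -> list:
    # "#ff8000" -> [255, 128, 0]; the leading "#" is optional here.
    digits = colour.lstrip("#")
    if len(digits) != 6:
        raise ValueError("expected six hexadecimal digits")
    return [int(digits[i:i + 2], 16) for i in (0, 2, 4)]


def rgb_to_hex(rgb) -> str:
    # [255, 128, 0] -> "#ff8000" (lowercase, preceded by "#").
    return "#" + "".join(f"{channel:02x}" for channel in rgb)


def opposite_colour(colour: str) -> str:
    # Invert every channel in the RGB colour space.
    return rgb_to_hex([255 - c for c in hex_to_rgb(colour)])


def mix_colours(first: str, second: str) -> str:
    # Per-channel midpoint, rounded down (an assumption).
    pairs = zip(hex_to_rgb(first), hex_to_rgb(second))
    return rgb_to_hex([(a + b) // 2 for a, b in pairs])
```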
String presentation transformations
- to uppercase (Z10018): Convert a string to uppercase letters
- to lowercase (Z10047): Convert a string to lowercase letters
- turn to superscript (Z19612): Takes a text, and all characters that have a superscript version are replaced with such.
- pretty " (Z11484): replace " with pretty left-right quotes depending on position
- pretty ' (Z11490): replace ' with pretty left-right quotes depending on position
- format large natural number strings by adding commas (Z13473): formats natural numbers that have many digits by adding commas
- pad string with leading characters to specified length (Z14770): add specified characters at the start until the string is of the required length
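Comma formatting and leading padding can be sketched in Python as below. Both function names are hypothetical. The sketch assumes Western grouping in threes (the catalogued function may group differently) and a single padding character.

```python
def add_commas(digits: str) -> str:
    # Split the digit string into groups of three, from the right.
    groups = []
    while len(digits) > 3:
        groups.insert(0, digits[-3:])
        digits = digits[:-3]
    groups.insert(0, digits)
    return ",".join(groups)


def pad_leading(text: str, fill: str, length: int) -> str:
    # str.rjust accepts exactly one fill character.
    return text.rjust(length, fill)


print(add_commas("1234567"))      # 1,234,567
print(pad_leading("42", "0", 5))  # 00042
```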
Uncommon String operations
These functions perform more advanced transformations, hold more state, and showcase the more advanced capabilities of Wikifunctions.
- left/inner/right mark replacement (Z11492): replaces the same mark (or substring) in a string with different replacements depending on position
- general positional mark replacement (Z11501): a generalisation of Z11492 to allow different spacers and specify isolated replacement
String classical cipher functions
(alphabet needs to be specified when calling these functions)
- Caesar cipher (Latin alphabet) (Z12812): rotates letters in the Latin alphabet forward by a defined number of places
- ROT1 (Latin alphabet) (Z10846): move by one letter in the English alphabet
- ROT13 (Latin alphabet) (Z10627): encode or decode a Latin alphabet string using the ROT13 cipher
- ROT25 (Latin alphabet) (Z10851): move each letter one letter back in the English alphabet
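ROT1, ROT13 and ROT25 are Caesar shifts by 1, 13 and 25 places. A minimal Python sketch, assuming the 26-letter ASCII Latin alphabet is fixed in the code (the hypothetical function `caesar` takes no alphabet argument) and that non-letters pass through unchanged:

```python
import string


def caesar(text: str, shift: int) -> str:
    # Rotate a-z and A-Z forward by shift places; keep everything else.
    lower = string.ascii_lowercase
    upper = string.ascii_uppercase
    k = shift % 26
    table = str.maketrans(
        lower + upper,
        lower[k:] + lower[:k] + upper[k:] + upper[:k],
    )
    return text.translate(table)


print(caesar("Hello, World!", 13))  # Uryyb, Jbeyq!
print(caesar("abc", 25))            # zab
```

Because 13 + 13 = 26, applying the shift of 13 twice returns the original text, which is why ROT13 serves as both encoder and decoder.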
Cryptographic hash functions
(would be better with types representing a stream of bytes)
- SHA-1 (Z10148): SHA-1 hash of the UTF-8 representation of a string, as a lowercase hexadecimal byte string
- SHA-256 (Z10124): returns the hexadecimal hash of a string in SHA-256
- SHA-384 (Z10132): returns the hexadecimal hash of a string in SHA-384
- SHA-512 (Z10067): hash a string using the SHA-512 function
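Since these functions take a string rather than bytes, an encoding has to be chosen before hashing. A minimal Python sketch using the standard `hashlib` module, assuming UTF-8 for all four algorithms (the catalogue states this only for SHA-1); the helper name `hash_hex` is hypothetical:

```python
import hashlib


def hash_hex(text: str, algorithm: str = "sha256") -> str:
    # Hash the UTF-8 bytes and return lowercase hexadecimal.
    return hashlib.new(algorithm, text.encode("utf-8")).hexdigest()


for name in ("sha1", "sha256", "sha384", "sha512"):
    print(name, hash_hex("Wikifunctions", name))
```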
Experimental String operations
TODO: Explain why these exist and when people might use them.
- (!) get lemma string from Lexeme JSON (Z10037): (!) approximates mw.wikibase.lexeme.entity.lexeme:getLemma(languageCode)
- Turkish final-obstruent devoicing for string (Z10022): returns the spelling of a Turkish string with its final voiced obstruent devoiced
- Base16 Encode (Z11003): Encode a string into base16
- Base16 Decode (Z11007): Decode a string from base16
- Base32 Encode (Z14189): no description
- Base32 Decode (Z14195): Decode a string from Base32
- Base64 Encode (Z10057): Encode a string into base64
- Base64 decode (Z10062): Decode a string from base64 (needed to demonstrate base64 encode/decode examples)
- debug (Z12941): prints the non-empty string passed to it as a debug and returns true, if empty returns false
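The Base16, Base32 and Base64 pairs above can be sketched with Python's standard `base64` module. The wrapper names are hypothetical, and encoding the string as UTF-8 before conversion is an assumption.

```python
import base64


def base16_encode(text: str) -> str:
    return base64.b16encode(text.encode("utf-8")).decode("ascii")


def base16_decode(encoded: str) -> str:
    return base64.b16decode(encoded).decode("utf-8")


def base32_encode(text: str) -> str:
    return base64.b32encode(text.encode("utf-8")).decode("ascii")


def base32_decode(encoded: str) -> str:
    return base64.b32decode(encoded).decode("utf-8")


def base64_encode(text: str) -> str:
    return base64.b64encode(text.encode("utf-8")).decode("ascii")


def base64_decode(encoded: str) -> str:
    return base64.b64decode(encoded).decode("utf-8")
```

For example, `base64_encode("hello")` gives `"aGVsbG8="`, and each decoder reverses its matching encoder.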
Wikitext and Mediawiki string operations
- italicise a simple string in Wikitext (Z11019): wrap string with two pairs of single quotes (ABC -> ''ABC''). Careful using this if your text has special formatting characters.
- bold in Wikitext (Z11139): bold a string by triple quoting, e.g. (ABC -> '''ABC'''). Careful if there are special characters.
- csv record to wikitable row (Z10919): Converts a validly formatted (RFC 4180) comma-separated value series into the contents of a valid wikitable row (not including the row start or row end characters) where variables are separated by '||', and any whitespace is unchanged. Be careful to validly render CSV with quoted fields and with pipes ('|') in the field.
- wrap with XML tag (Z11156): adds <tag> and </tag> around a string
- substitute mediawiki editchangetags query (Z17954): 36621225 ... ↓ ?action=editchangetags&ids%5B36621225%5D=1 ... &ids
- substitute mediawiki revisiondelete query (Z17956): 36621225 ... ↓ ?action=revisiondelete&ids%5B36621225%5D=1 ... &ids
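The simpler wikitext wrappers and the CSV-to-row conversion can be sketched in Python. The function names are hypothetical. The sketch relies on the standard `csv` module for quoted fields and, as a known limitation, does not escape pipes (`|`) inside fields.

```python
import csv


def italicise(text: str) -> str:
    # ABC -> ''ABC''
    return "''" + text + "''"


def bold(text: str) -> str:
    # ABC -> '''ABC'''
    return "'''" + text + "'''"


def wrap_with_xml_tag(text: str, tag: str) -> str:
    # Adds <tag> and </tag> around the string.
    return f"<{tag}>{text}</{tag}>"


def csv_record_to_wikitable_row(record: str) -> str:
    # Parse one CSV record, then join the cells with '||'.
    fields = next(csv.reader([record]), [])
    return "||".join(fields)
```

For example, `csv_record_to_wikitable_row('a,"b,c",d')` yields `a||b,c||d`, with the quoted comma kept inside its cell.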
Comma-separated operations
- string is element of CSV (Z11094): tests whether a string is an element of a validly formatted (RFC 4180) comma-separated value series (single row, not whole file); be careful to validly interpret a CSV with quoted fields
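A membership test on a single CSV record has to respect quoted fields, so splitting on commas is not enough. A minimal Python sketch using the standard `csv` module; the function name is hypothetical, and exact (whitespace-sensitive) field comparison is an assumption.

```python
import csv


def is_element_of_csv(value: str, record: str) -> bool:
    # Parse the record as one CSV row and compare whole fields.
    fields = next(csv.reader([record]), [])
    return value in fields


print(is_element_of_csv("b,c", 'a,"b,c",d'))  # True
print(is_element_of_csv("b", 'a,"b,c",d'))    # False
```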