Support for Data Quality Functions
Data Quality functions are contributed by the OSDQ Project. The function names carry the prefix osdq. but may also be called without it.
osdq.random
This function returns a randomized copy of the string. For example, sourceValue may randomize to eusolercaVu.
random(sourceValue)
sourceValue is the string that needs to be randomized.
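The example above (sourceValue becoming eusolercaVu) suggests that the characters are shuffled rather than replaced. A minimal Python sketch under that assumption follows; the name shuffle_string and the optional rng parameter are hypothetical and are not part of the OSDQ API.

```python
import random
from typing import Optional


def shuffle_string(source_value: str, rng: Optional[random.Random] = None) -> str:
    """Return the characters of source_value in random order (assumed behavior)."""
    rng = rng or random.Random()
    chars = list(source_value)
    rng.shuffle(chars)  # in-place shuffle of the character list
    return "".join(chars)


print(shuffle_string("sourceValue"))  # some permutation of the input
```

The result always has the same length and the same characters as the input; only their order changes.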
osdq.digit
This function returns the digit characters of the string. For example, a1 b2 c3 d4 will become 1234.
digit(sourceValue)
sourceValue is the string from which digits need to be extracted.
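The behavior described above can be sketched in Python as follows. The name extract_digits is hypothetical, and the sketch assumes that only the ASCII digits 0 to 9 count as digit characters.

```python
def extract_digits(source_value: str) -> str:
    """Keep only the ASCII digit characters, preserving their order."""
    return "".join(ch for ch in source_value if ch in "0123456789")


print(extract_digits("a1 b2 c3 d4"))  # 1234
```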
osdq.whitespaceIndex
This function returns the index of the first whitespace character, counting from zero. For example, source Value will return 6.
whitespaceIndex(sourceValue)
sourceValue is the string in which the whitespace index needs to be found.
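A rough Python equivalent of the lookup described above is shown here. The name whitespace_index is hypothetical, and returning -1 when the string contains no whitespace is an assumption of this sketch; the source does not say what the function returns in that case.

```python
def whitespace_index(source_value: str) -> int:
    """Return the zero-based index of the first whitespace character."""
    for index, ch in enumerate(source_value):
        if ch.isspace():
            return index
    return -1  # assumption: the no-whitespace result is not documented


print(whitespace_index("source Value"))  # 6
```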
osdq.validCreditCard
This function returns TRUE if the string matches credit card number logic and passes the checksum, that is, if it is a valid credit card number.
validCreditCard(cc)
cc is the credit card number string that needs to be checked.
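Credit card numbers conventionally use the Luhn checksum, so the checksum part of this test can be sketched as below. This is an illustration only: the name luhn_checksum_ok is hypothetical, stripping spaces and hyphens is an assumption, and the OSDQ function may apply further rules (length, issuer prefix) that this sketch does not cover.

```python
def luhn_checksum_ok(cc: str) -> bool:
    """Check a card number string against the Luhn checksum."""
    cleaned = cc.replace(" ", "").replace("-", "")  # assumed separators
    if not (cleaned.isascii() and cleaned.isdigit()):
        return False
    total = 0
    # Walk from the rightmost digit; double every second digit.
    for position, ch in enumerate(reversed(cleaned)):
        digit = int(ch)
        if position % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9  # same as summing the two digits of the product
        total += digit
    return total % 10 == 0


print(luhn_checksum_ok("4111 1111 1111 1111"))  # True (a common test number)
print(luhn_checksum_ok("4111 1111 1111 1112"))  # False
```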
osdq.validSSN
This function returns TRUE if the string matches SSN logic and is a valid SSN.
validSSN(ssn)
ssn is the SSN string that needs to be checked.
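The source does not spell out the SSN logic. As an illustration, the sketch below applies the commonly cited structural rules for US Social Security numbers: nine digits, area not 000, 666, or 900 to 999, group not 00, serial not 0000. Whether OSDQ checks exactly these rules, and whether it accepts hyphens, is an assumption; the name valid_ssn_sketch is hypothetical.

```python
import re

# Three groups of ASCII digits, with optional hyphens between them (assumed).
_SSN_PATTERN = re.compile(r"([0-9]{3})-?([0-9]{2})-?([0-9]{4})")


def valid_ssn_sketch(ssn: str) -> bool:
    """Apply common structural SSN rules; not the OSDQ implementation."""
    match = _SSN_PATTERN.fullmatch(ssn)
    if match is None:
        return False
    area, group, serial = match.groups()
    if area in ("000", "666") or area.startswith("9"):
        return False
    return group != "00" and serial != "0000"


print(valid_ssn_sketch("123-45-6789"))  # True
print(valid_ssn_sketch("000-12-3456"))  # False
```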
osdq.validPhone
This function returns TRUE if the string matches phone logic: more than 8 characters, fewer than 12 characters, and not starting with 000.
validPhone(phone)
phone is the phone number string that needs to be checked.
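The three stated rules translate directly into code. The sketch below checks only those rules; whether the real function also requires digits or handles separators is not stated in the source, so nothing of that kind is assumed here. The name valid_phone_sketch is hypothetical.

```python
def valid_phone_sketch(phone: str) -> bool:
    """More than 8 and fewer than 12 characters, not starting with 000."""
    return 8 < len(phone) < 12 and not phone.startswith("000")


print(valid_phone_sketch("5551234567"))    # True  (10 characters)
print(valid_phone_sketch("12345678"))      # False (only 8 characters)
print(valid_phone_sketch("0001234567"))    # False (starts with 000)
```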
osdq.validEmail
This function returns TRUE if the string is a valid email address.
validEmail(email)
email is the email address string that needs to be checked.
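The source does not define what counts as a valid email address. The sketch below uses a deliberately simple pattern (local part, @, and a domain with at least one dot) to show the general shape of such a check. The pattern and the name looks_like_email are assumptions and will differ from whatever OSDQ actually implements.

```python
import re

# Simplified pattern: local-part@label.label[.label...]; not a full RFC check.
_EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(\.[A-Za-z0-9-]+)+")


def looks_like_email(email: str) -> bool:
    """Return True if the whole string matches the simplified pattern."""
    return _EMAIL_PATTERN.fullmatch(email) is not None


print(looks_like_email("user@example.com"))  # True
print(looks_like_email("user@example"))      # False
```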
osdq.cosineDistance
This function returns the float distance between two strings based on the cosine similarity algorithm.
cosineDistance(a, b)
a and b are the strings between which the distance is calculated.
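For illustration, a cosine distance between two strings can be computed as one minus the cosine similarity of their term-frequency vectors. The sketch below tokenizes on whitespace, which is an assumption; the source does not say how OSDQ tokenizes or whether the returned float is the similarity or its complement. The name cosine_distance_sketch is hypothetical.

```python
import math
from collections import Counter


def cosine_distance_sketch(a: str, b: str) -> float:
    """1 - cosine similarity of whitespace-token count vectors."""
    va, vb = Counter(a.split()), Counter(b.split())
    dot = sum(va[token] * vb[token] for token in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(count * count for count in va.values()))
    norm_b = math.sqrt(sum(count * count for count in vb.values()))
    if norm_a == 0 or norm_b == 0:
        return 1.0  # assumption: an empty string is maximally distant
    return 1.0 - dot / (norm_a * norm_b)


print(cosine_distance_sketch("data quality", "quality data"))  # close to 0
print(cosine_distance_sketch("data quality", "credit card"))   # 1.0
```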
osdq.jaccardDistance
This function returns the float distance between two strings based on the Jaccard similarity algorithm.
jaccardDistance(a, b)
a and b are the strings between which the distance is calculated.
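The Jaccard distance is one minus the ratio of the intersection size to the union size of two sets. The sketch below builds the sets from the characters of each string, which is an assumption (word tokens would be another option); the name jaccard_distance_sketch is hypothetical.

```python
def jaccard_distance_sketch(a: str, b: str) -> float:
    """1 - |A ∩ B| / |A ∪ B| over the character sets of a and b."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    if not union:
        return 0.0  # assumption: two empty strings are identical
    return 1.0 - len(set_a & set_b) / len(union)


print(jaccard_distance_sketch("abc", "abd"))  # 0.5
```

Here the two strings share the characters a and b out of four distinct characters overall, giving a similarity of 2/4 and a distance of 0.5.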
osdq.jaroWinklerDistance
This function returns the float distance between two strings based on the Jaro-Winkler algorithm.
jaroWinklerDistance(a, b)
a and b are the strings between which the distance is calculated.
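As a sketch of the algorithm named above, the code below computes the Jaro similarity, applies the Winkler prefix bonus with the customary scaling factor 0.1 and a prefix cap of 4, and returns one minus the result. Whether OSDQ returns the similarity or the distance, and which scaling factor it uses, is not stated in the source; the function names are hypothetical.

```python
def jaro_similarity(a: str, b: str) -> float:
    """Jaro similarity in the range 0.0 to 1.0."""
    if a == b:
        return 1.0
    if not a or not b:
        return 0.0
    window = max(max(len(a), len(b)) // 2 - 1, 0)
    a_matched = [False] * len(a)
    b_matched = [False] * len(b)
    matches = 0
    for i, ch in enumerate(a):
        # Look for an unmatched equal character within the match window.
        for j in range(max(0, i - window), min(len(b), i + window + 1)):
            if not b_matched[j] and b[j] == ch:
                a_matched[i] = b_matched[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    a_seq = [a[i] for i in range(len(a)) if a_matched[i]]
    b_seq = [b[j] for j in range(len(b)) if b_matched[j]]
    # Half the number of matched characters that appear in a different order.
    transpositions = sum(x != y for x, y in zip(a_seq, b_seq)) // 2
    return (matches / len(a) + matches / len(b)
            + (matches - transpositions) / matches) / 3


def jaro_winkler_distance_sketch(a: str, b: str, scaling: float = 0.1) -> float:
    """1 - Jaro-Winkler similarity, with a common-prefix bonus (max 4 chars)."""
    jaro = jaro_similarity(a, b)
    prefix = 0
    for x, y in zip(a[:4], b[:4]):
        if x != y:
            break
        prefix += 1
    return 1.0 - (jaro + prefix * scaling * (1.0 - jaro))


print(round(jaro_winkler_distance_sketch("MARTHA", "MARHTA"), 4))  # 0.0389
```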
osdq.levenshteinDistance
This function returns the float distance between two strings based on the Levenshtein algorithm.
levenshteinDistance(a, b)
a and b are the strings between which the distance is calculated.
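The Levenshtein distance counts the minimum number of single-character insertions, deletions, and substitutions needed to turn one string into the other. A minimal dynamic-programming sketch is shown below. The name levenshtein_distance_sketch is hypothetical, and the sketch returns a plain integer edit count, whereas the source describes the result as a float, so the real function may convert or normalize the value.

```python
def levenshtein_distance_sketch(a: str, b: str) -> int:
    """Edit distance computed row by row with O(len(b)) memory."""
    previous = list(range(len(b) + 1))  # distance from "" to each prefix of b
    for i, char_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


print(levenshtein_distance_sketch("kitten", "sitting"))  # 3
```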
Data Quality functions available since v4.4