Quando is a configurable date parser which picks up where Date.strptime stops. It was made to work with non-standard, multi-language dates (that is, dates recorded by humans in languages other than English) but can be used for almost any date format.
A typical use case for Quando is dealing with input like:
"01 января 2019 г."
"1-ЯНВ-19"
"01.01.19"
"1/Jan/2019"
"yanvar'19"
"ЯНВ"
This is a real-life example of how people would routinely write January 1, 2019 in Russia, but since many countries have their own words for month names, it might be a common problem.
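To make the gap concrete, here is a minimal sketch using only Ruby's standard library. The helper `try_strptime` is a hypothetical name introduced for this illustration. It shows that Date.strptime copes with an English month abbreviation but rejects the Russian month name, because its %B and %b directives only know English names.

```ruby
require 'date'

# Hypothetical helper for illustration: returns a Date, or nil when the
# input does not fit the given strptime format.
def try_strptime(input, format)
  Date.strptime(input, format)
rescue ArgumentError # Date::Error is a subclass of ArgumentError on newer Rubies
  nil
end

english = try_strptime('1/Jan/2019', '%d/%b/%Y')          # a Date for 2019-01-01
russian = try_strptime('01 января 2019 г.', '%d %B %Y г.') # nil: month name not recognised

puts english.inspect
puts russian.inspect
```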
gem install quando
and then
require 'quando'
Quando.configure do |c|
# Define regular expressions to identify possible month names:
c.jan = /january|jan|yanvar|январь|января|янв/i # simplified for readability
# c.feb = …
# …more configuration…
# Then combine them into regexps that will match the date formats you accept:
c.formats = [
/#{c.day} #{c.month_txt} #{c.year} г\./i, # matches "01 января 2019 г."
/#{c.day}\.#{c.month_num}\.#{c.year2}/i, # matches "01.01.19"
/#{c.month_txt}'#{c.year2}/i, # matches "январь'19"
/#{c.month_txt}/i, # matches "ЯНВ"
]
end
Quando.parse("01 января 2019 г.") #=> #<Date: 2019-01-01>
Quando.parse("01.01.19") #=> #<Date: 2019-01-01>
Quando.parse("январь'19") #=> #<Date: 2019-01-01>
Quando.parse("ЯНВ") #=> #<Date: 2019-01-01> (given that current year is 2019)
Configuration properties can be set by passing a block to the Quando.configure method, as seen in the example above, or by calling the setter methods on the configuration object directly:
Quando.config.jun = /qershor|mehefin/ # Albanian and Welsh month names
Quando.config.jul = /korrik|gorffennaf/ # will make you cry
If you need to use grouping, remember that non-capturing groups (?:abc) provide better performance.
If, for some reason, you need to use named groups (?<name>abc), avoid the names day, month and year. Quando uses them internally, so conflicts are possible.
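The difference between the two kinds of groups mentioned above can be seen with plain Ruby regular expressions. This is only a sketch: the `suffix` group is a made-up extra capture, chosen to show a name that does not clash with day, month or year.

```ruby
# Non-capturing group: the optional ending is grouped, but nothing extra is stored.
month = /jan(?:uary)?/i

# day, month and year are the names the text above says Quando relies on.
# Any additional named group (here the hypothetical "suffix") uses another name.
format = /(?<day>\d{1,2}) (?<month>#{month}) (?<year>\d{4})(?<suffix> AD)?/i

m = format.match('14 January 1965')

puts m.names.inspect # only the named groups appear; (?:uary) left no trace
puts m[:month]
```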
To let Quando recognize months in your language you need to define corresponding regular expressions for all months:
Quando.configure do |c|
# In Finland, your matchers might look like this:
c.jan = /jan(?:uary)? | tammikuu(?:ta)? /xi
c.feb = /feb(?:ruary)? | helmikuu(?:ta)? /xi
c.mar = /mar(?:ch)? | maaliskuu(?:ta)?/xi
c.apr = /apr(?:il)? | huhtikuu(?:ta)? /xi
c.may = /may | toukokuu(?:ta)? /xi
c.jun = /june? | kesäkuu(?:ta)? /xi
c.jul = /july? | heinäkuu(?:ta)? /xi
c.aug = /aug(?:ust)? | elokuu(?:ta)? /xi
c.sep = /sep(?:tember)? | syyskuu(?:ta)? /xi
c.oct = /oct(?:ober)? | lokakuu(?:ta)? /xi
c.nov = /nov(?:ember)? | marraskuu(?:ta)?/xi
c.dec = /dec(?:ember)? | joulukuu(?:ta)? /xi
# …more configuration…
end
Quando comes with defaults that will probably work in most situations:
Quando.config.day matches numbers from 1 to 31, both zero-padded and unpadded;
Quando.config.month_num matches numbers from 1 to 12, both zero-padded and unpadded;
Quando.config.year matches any 4-digit sequence;
Quando.config.year2 matches any 2-digit sequence.
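As a rough illustration of what matchers with these properties could look like, here is a sketch in plain Ruby. The constants and the `whole_match?` helper are hypothetical stand-ins written for this example and are not taken from Quando's source.

```ruby
# Hypothetical stand-ins for the default date part matchers (illustration only).
DAY       = /(?<day>3[01]|[12]\d|0?[1-9])/  # 1..31, zero-padded or not
MONTH_NUM = /(?<month>1[0-2]|0?[1-9])/      # 1..12, zero-padded or not
YEAR      = /(?<year>\d{4})/                # any 4-digit sequence
YEAR2     = /(?<year>\d{2})/                # any 2-digit sequence

# True when the matcher covers the entire input string.
def whole_match?(matcher, input)
  !!(/\A#{matcher}\z/ =~ input)
end

puts whole_match?(DAY, '31')       # true
puts whole_match?(DAY, '32')       # false
puts whole_match?(MONTH_NUM, '13') # false
```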
If you need to adjust these matchers, make sure that they produce the named captures day, month and year, respectively:
Quando.config.day = /(?<day> …)/
Quando.config.month_num = /(?<month> …)/
Quando.config.year = /(?<year> …)/
By default, Quando.config.dlm will greedily match spaces, dashes, dots and slashes.
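A delimiter matcher with that behaviour might look like the sketch below. The `dlm` pattern here is an assumed equivalent written for illustration, not a copy of Quando's default. Because the match is greedy, a run of mixed separators counts as a single delimiter.

```ruby
# Assumed equivalent of the default delimiter (illustration only):
# one or more spaces, dashes, dots or slashes.
dlm = /[ \-.\/]+/

format = /\A(?<day>\d{1,2})#{dlm}(?<month>\d{1,2})#{dlm}(?<year>\d{4})\z/

inputs = ['14.04.1965', '14/4/1965', '14 - 04 - 1965', '14 . / 04-1965']
parts = inputs.map do |input|
  m = format.match(input)
  m && [m[:day], m[:month], m[:year]]
end

parts.each { |p| puts p.inspect }
```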
With format matchers you describe the concrete date formats that Quando will recognize. Within them you can include the date part matchers you defined previously.
Quando.config.day, Quando.config.month_num, Quando.config.month_txt, Quando.config.year, Quando.config.year2 can be used.
Quando.config.month_txt is a regexp that automatically combines all textual month matchers, and will thus match any month.
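One way such a combined matcher could be built is with Regexp.union from Ruby's core library. The sketch below is an assumption about the general idea, not Quando's actual implementation. The `months` hash and the `month_number` helper are hypothetical, and only three months are listed to keep it short.

```ruby
# Hypothetical per-month matchers, in calendar order (illustration only).
months = {
  jan: /jan(?:uary)?|tammikuu(?:ta)?/i,
  feb: /feb(?:ruary)?|helmikuu(?:ta)?/i,
  mar: /mar(?:ch)?|maaliskuu(?:ta)?/i
}

# Combine all month matchers into one regexp with a "month" named capture.
month_txt = /(?<month>#{Regexp.union(months.values)})/

# Find which month a captured name belongs to (1-based), or nil if none.
month_number = lambda do |text|
  index = months.values.index { |matcher| /\A#{matcher}\z/ =~ text }
  index && index + 1
end

captured = month_txt.match('14 helmikuuta 2019')[:month]
puts captured
puts month_number.call(captured)
```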
Quando.configure do |c|
# …some initial configuration…
c.formats = [
/^ #{c.day} #{c.dlm} #{c.month_txt} #{c.dlm} #{c.year} $/xi,
# compiles into something like
# /^ (?<day> …) [ -.\/]+ (?<month> jan|feb|…) [ -.\/]+ (?<year> …) $/xi
# and returns ~ #<MatchData "14 Apr 1965" day:"14" month:"Apr" year:"1965">
# on successful match
]
end
Quando matches regular expressions from Quando.config.formats, in the specified order, against the input. If there is a match, the resulting MatchData object is analyzed.
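The "first format that matches wins" idea can be sketched in a few lines of plain Ruby. The `FORMATS` list and the `first_match` helper are hypothetical and exist only for this illustration. The point is that order matters, so more specific formats should come first.

```ruby
# Hypothetical format list, most specific first (not Quando's defaults).
FORMATS = [
  /\A(?<day>\d{1,2})\.(?<month>\d{1,2})\.(?<year>\d{4})\z/,
  /\A(?<month>\d{1,2})\.(?<year>\d{4})\z/,
  /\A(?<year>\d{4})\z/
]

# Returns the MatchData of the first format that matches, or nil.
def first_match(input, formats = FORMATS)
  formats.each do |format|
    m = format.match(input)
    return m if m
  end
  nil
end

puts first_match('14.04.1965').names.inspect
puts first_match('04.2019').names.inspect
puts first_match('not a date').inspect
```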
If there is a named capture :day or :month, its value is used in the result, provided that it is within the valid range. If the format matcher does not define such a named group, 1 is used:
Quando.config.formats = [
/#{Quando.config.month_num}\.#{Quando.config.year}/
]
Quando.parse('04.2019') #=> #<Date: 2019-04-01>
If there is a named capture :year, it is used in the result. If the format matcher does not define such a named group, the current UTC year is used. If the captured value is less than 100 (which is the case for years written as 2-digit numbers), Quando will use the Quando.config.century setting (defaults to 21), effectively converting, for example, 18 to 2018. Be mindful of this behaviour and adjust Quando.config.century accordingly:
Quando.config.formats = [Quando.config.year]
Quando.parse('2019') #=> #<Date: 2019-01-01>
Quando.config.formats = [Quando.config.year2]
Quando.parse('65') #=> #<Date: 2065-01-01>
Quando.parse('65', century: 20) #=> #<Date: 1965-01-01>
# or
Quando.config.century = 20
Quando.parse('65') #=> #<Date: 1965-01-01>
Out of the box, Quando will parse a reasonable variety of day-month-year ordered numerical and English textual dates. Some examples:
14.4.1965, 14/04/1965, …
14-apr-1965, 14 Apr 1965, …
April 1965, apr 1965, …
13.12.05, 13-12-05, …
April, APR, …
See Quando.config.formats for details.
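The fallback rules described earlier (a missing day or month becomes 1, a missing year becomes the current UTC year, and a 2-digit year is expanded using the century setting) can be condensed into a short sketch. The `build_date` helper is hypothetical and handles numeric captures only. How Quando itself reacts to out-of-range values is not covered here.

```ruby
require 'date'

# Hypothetical helper, not part of Quando: builds a Date from a hash of
# captured strings, applying the fallback rules described in the text.
def build_date(captures, century = 21)
  day   = (captures['day']   || 1).to_i
  month = (captures['month'] || 1).to_i
  year  = captures['year'] ? captures['year'].to_i : Time.now.utc.year
  # 2-digit years: century 21 maps 18 to 2018, century 20 maps 65 to 1965.
  year += (century - 1) * 100 if year < 100
  Date.new(year, month, day) # raises on an invalid day/month combination
end

puts build_date('month' => '04', 'year' => '2019')
puts build_date('year' => '65')
puts build_date({ 'year' => '65' }, 20)
```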
You can configure Quando instances independently of each other and of the class:
Quando.parse('14-abril-1965') #=> nil
date_parser = Quando::Parser.new.configure do |c|
# …some configuration…
end
date_parser.parse('14-abril-1965') #=> #<Date: 1965-04-14>
Quando.parse('14-abril-1965') #=> nil
or just pass a format matcher as a parameter:
m = /(?<year>#{Quando.config.year}) (?<day>\d\d) (?<month>[A-Z]+)/i
Quando.parse('1965 14 Apr', matcher: m) #=> #<Date: 1965-04-14>
In both cases it will not change the global configuration (but note that calling setter methods on Quando.config will).
Requires Ruby >= 1.9.3. Enjoy!