todatetime - version 1.1
Stata usually imports dates from .xlsx, .csv, and .txt files as string variables. todatetime converts these strings to datetime format in one easy-to-remember line of code.
Installing and updating
To install the current version of todatetime, use the following code to search for zach.prof in Stata:
search zach.prof
Next click the link for todatetime. Then click the link to install todatetime.
After installation, you can type help todatetime in Stata to view a comprehensive help file. You can also read the remainder of this webpage for examples that illustrate all of todatetime's features.
If you have an old version of todatetime on your computer and want to update to the current version, run the following code in Stata to uninstall the old version:
net uninstall todatetime
Then run the search command from above and follow the links to install the current version.
Alternatively, you can use the following net install commands to install a specific version of todatetime, whether current or earlier:
net install todatetime, from("https://raw.githubusercontent.com/zachprof/todatetime/1.1") // installs version 1.1 (current)
net install todatetime, from("https://raw.githubusercontent.com/zachprof/todatetime/1.0") // installs version 1.0
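If you would rather overwrite an existing installation in one step, one possible shortcut is the replace option of net install, followed by which to confirm what Stata now finds. This is only a sketch: it assumes the version 1.1 URL shown above, and it relies on standard Stata commands rather than anything documented in the todatetime help file.

```stata
// overwrite any installed copy with version 1.1 (replace is a standard net install option)
net install todatetime, replace from("https://raw.githubusercontent.com/zachprof/todatetime/1.1")

// show where the installed todatetime.ado is located
which todatetime
```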
Getting started
To demonstrate todatetime's functionality, I am going to use a dataset I created of random dates. This dataset will be accessible in Stata after installing todatetime. The code in my examples can be accessed by downloading todatetime examples.do.
To load the dataset of random dates, type the following two commands in Stata:
clear
sysuse randomdates
After loading the dataset, take a moment to look at the variables. There should be four variables, each with 10 observations. Each variable contains different dates that you might encounter in your research.
Variable descriptions:
date1 = random string dates in day, month, year format with no separators (e.g., "08jan2010")
date2 = random string dates in day, month, year format with hyphen separators (e.g., "1-Feb-2006")
date3 = random string dates in month, day, year format with forward slash separators (e.g., "6/14/2013")
date4 = random string dates in year, month, day format with no separators (e.g., "2010jan14")
Converting these variables to datetime format is as easy as typing "todatetime" followed by the name(s) of the variable(s) you want to convert, as follows:
todatetime date1 date2 date3 date4
If you open the data after running this code, you'll see that date1, date2, date3, and date4 are all successfully converted to datetime format and the old string variables are saved with the suffix "_string" appended to their names.
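For readers curious what this conversion amounts to, here is a rough manual equivalent for a date2-style variable, written with Stata's built-in date() function. It is a sketch under stated assumptions: the two values are made up and typed in with input rather than taken from randomdates, the %td display format is my choice, and whether todatetime performs these exact steps internally is not confirmed by this page. Note that it clears the data in memory.

```stata
clear
input str11 date2
"1-Feb-2006"
"23-Nov-1999"
end

rename date2 date2_string                    // keep the original strings under a _string name
generate date2 = date(date2_string, "DMY")   // parse day, month, year into a numeric Stata date
format date2 %td                             // display the number as a daily date
list
```

The numeric variable can then be used with Stata's date functions, while the string copy remains available for checking.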
Note that todatetime uses a fairly simple method to detect the format of input data, so in some cases it cannot convert strings to datetime format automatically. When this happens, a warning message appears, and you need to use the infmt option (explained in the next section).
This concludes my getting started tutorial. You can continue reading for more examples that demonstrate todatetime's options. You can also type help todatetime in Stata to view a comprehensive help file containing all the information you need to put todatetime to work in your next research project.
Options
todatetime has the following four options:
infmt: defines the order of days, months, and years in the input data
suffix: changes the suffix appended to string variables
undo: undoes changes previously made by todatetime
erasestring: erases dates formatted as strings; cannot be reversed with the undo option
The infmt option is required when todatetime cannot automatically detect the format of input data. This most commonly occurs when a date does not specify the century (e.g., June 14, 2013 is recorded as "6/14/13" rather than "6/14/2013").
To illustrate, I've prepared a dataset of random dates that do not specify the century. This dataset will be accessible in Stata after installing todatetime and you can load it with the following code:
clear
sysuse nocenturydates
To see what happens when todatetime can't automatically detect the format of input data, try the following code:
todatetime date
This will produce the following warning message:
Warning: date could not be converted to datetime format automatically;
use infmt to manually specify M, D, and [##]Y
As the warning suggests, you need to specify the order of months, days, and years, and the century, using M (for month), D (for day), and [##]Y (for year with century). If you take a look at the data, you'll see date is formatted with the month first, then the day, followed by the year, using forward slash separators (e.g., "6/14/13"). Therefore, to successfully convert the variable to datetime format, you would need to run the following code (assuming the correct century is 2000):
todatetime date, infmt(MD20Y)
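For comparison, the MD20Y mask has the same form as the mask accepted by Stata's built-in date() function, where 20Y tells Stata to read two-digit years as 20xx. The sketch below shows that behavior on two made-up values typed in with input; date_num is a hypothetical variable name, and whether todatetime passes the mask to date() unchanged is an assumption.

```stata
clear
input str8 date
"6/14/13"
"12/1/09"
end

generate date_num = date(date, "MD20Y")   // month, day, then a two-digit year in the 2000s
format date_num %td                       // display the number as a daily date
list
```

If the dates belonged to the 1900s instead, the same mask with 19Y in place of 20Y would apply.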
Now let's load the dataset of random dates used in the last section to explore todatetime's other options. You can do this with the following code:
clear
sysuse randomdates
The next option is suffix, which allows you to change the suffix appended to the original string variable. To see how it works, try the following code:
todatetime date1, suffix(_s)
You'll see after running this code that the string version of date1 is named date1_s rather than date1_string.
You can undo this change, and any other change todatetime makes except those made with erasestring, using the undo option. For undo to work correctly, you must specify the same options that were used when the original change was made. For example, the changes made to date1 can be undone with the following code:
todatetime date1, suffix(_s) undo
It's important to note that undo works on only one variable at a time. If you convert several variables in one command, you must undo each of them in a separate command. For example, try the following code:
todatetime date2 date3 // make changes
todatetime date2 date3, undo // will result in error
todatetime date2, undo // successfully undoes changes to date2
todatetime date3, undo // successfully undoes changes to date3
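Because undo accepts one variable per call, a foreach loop can issue the separate calls for you. The following is a minimal sketch that could stand in for the two single-variable undo lines above; it starts from a fresh copy of randomdates and assumes date2 and date3 are converted with the default options.

```stata
clear
sysuse randomdates
todatetime date2 date3            // convert both variables in one command

foreach v in date2 date3 {        // undo them one call at a time
    todatetime `v', undo
}
```

If the variables were converted with a custom suffix, the same suffix option would need to appear inside the loop.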
The last option in todatetime is erasestring, which erases the string variable after converting it to datetime format. However, after a change is made using erasestring, it cannot be undone with the undo option. To see erasestring in action, try the following code:
todatetime date1 date2 date3 date4, erasestring
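Since erasestring cannot be reversed with undo, one cautious way to experiment is to wrap the call in Stata's preserve and restore commands. These are standard Stata commands, not todatetime options, and the sketch below assumes a fresh copy of randomdates. Keep in mind that restore brings back the entire dataset as it was at preserve, so the datetime conversion is discarded along with the erasure.

```stata
clear
sysuse randomdates

preserve                                          // snapshot the data currently in memory
todatetime date1 date2 date3 date4, erasestring   // convert and drop the string versions
describe                                          // inspect the result
restore                                           // return to the snapshot with the original strings
```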
If you've read everything on this page, congratulations! You know all of todatetime's options. Remember you can type help todatetime in Stata any time you need a refresher.
Happy coding!
Acknowledgements: I owe a special thanks to Diana Weng and Erika Wheeler for carefully reviewing and providing helpful feedback on version 1.0 of this code, the complementary help file, and the documentation on this page.
Code access: The files underlying every version of this code as well as information about what changed from one version to the next are readily accessible on GitHub.