Our search journey continues.
We have accomplished the hard part: checking a single star to see if it has the traits we’re looking for.
Today we just have to use that logic to search a set of stars.
First we’ll examine a handpicked selection.
Guess what happens after that?
We finally get back into the full HYG Catalog and search for stars from the command line.
After all this work, stellar grows up and becomes an application.
Building a Catalog and Searching It
The first thing that’s tripping me up is how to set up the catalog itself.
You know the "set of stars" I was talking about?
The easy way to do this from a test is to have a few CSV strings for some sample stars, apply extract_from_csv to each of them, push each star into an array, then search through the array.
Thing is, I know that this is not going to be acceptable when I get to the real data.
I expect this application to be one where you run it from the command line, using your search conditions as command line arguments.
Loading all the data before searching it takes time.
I should write this code so that it searches while reading in data.
That would be much faster.
On the other hand, what if I add an interactive prompt to this application later? Loading the full catalog into memory before applying searches could be faster in the long run compared to reading the data file for every search.
That is trying to predict the future, though. I know how I want to use this catalog today. I want to run a search and see the results as soon as the application knows about them.
I will share a secret. I spent a day writing the "load then search" approach to building the catalog. Guess what? It is unbearably slow at my current Parrot skill level. I am confident that this is only slow because my code overall is simplistic. Maybe I can revisit this idea after learning more about Parrot.
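The difference between the two approaches is easier to see side by side. What follows is a minimal sketch in Python rather than PIR, so it is not the code from this project. The function names `load_then_search` and `search_while_reading` and the three sample stars are made up for illustration.

```python
import csv
import io

# Hypothetical sample data: three invented stars with only two fields.
SAMPLE = """Name,Spectrum
Alpha,G2V
Beta,K3V
Gamma,G2V
"""

def load_then_search(handle, field, value):
    # Read every row into memory first, then filter the finished list.
    catalog = list(csv.DictReader(handle))
    return [star for star in catalog if star[field] == value]

def search_while_reading(handle, field, value):
    # Check each row as it is read and hand back a match immediately.
    for star in csv.DictReader(handle):
        if star[field] == value:
            yield star

# The caller sees each match as soon as its line has been processed.
for star in search_while_reading(io.StringIO(SAMPLE), "Spectrum", "G2V"):
    print(star["Name"])
```

Both functions find the same stars. The second one never holds more than one row at a time, which is the behavior I want for a one-shot command line search.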
Searching The Catalog
I do not want to dig right into searching the full 119,617 entries of the real catalog. Instead, let’s set up a small test catalog and write some tests.
Where you put your test data is a matter of taste.
I will be keeping my data in a folder named data.
That seems reasonable.
$ mkdir data
Only a few entries are needed in the test catalog. We just need to be sure that the code works with a CSV file with the same structure as the HYG database. I’ll grab Sol, another G2V spectrum star, and a K3V star.
The test data is out of the way, so now I feel comfortable writing the tests that use it.
I am deliberately keeping the tests simple right now. The goal is to make sure the basic functionality works rather than to guarantee behavior for every little detail. Tests can be added for those details as they become important.
The search_catalog sub borrows quite a bit from step 07.
search_catalog will handle the task of reading the file and looking for stars that match the search conditions it has been given.
After it defines a star from the current line, it asks check_star to compare that star to the set of conditions it has been given.
It remembers the stars that match, and returns them once it has reached the end of the file.
It is not the fastest approach, but it works.
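The flow described above can be sketched roughly as follows. This is Python rather than PIR, so treat it as an illustration of the shape, not the project's actual code. The names search_catalog, check_star, and extract_from_csv come from this series, but their signatures here are assumptions, as are the header-row handling and the invented sample stars.

```python
import csv
import os
import tempfile

def extract_from_csv(header, line):
    # Assumed signature: turn one CSV line into a star keyed by field name.
    values = next(csv.reader([line]))
    return dict(zip(header, values))

def check_star(star, conditions):
    # A star matches only when every requested field has exactly the wanted value.
    return all(star.get(field) == wanted for field, wanted in conditions.items())

def search_catalog(path, conditions):
    matches = []
    with open(path, newline="") as catalog:
        # Assumption: the first line of the file names the fields.
        header = next(csv.reader([catalog.readline()]))
        for line in catalog:
            star = extract_from_csv(header, line)
            if check_star(star, conditions):
                matches.append(star)  # remember the stars that match
    return matches  # returned once the end of the file has been reached

# Try it against a tiny throwaway catalog of invented stars.
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "sample.csv")
    with open(path, "w", newline="") as out:
        out.write("Name,Spectrum\nAlpha,G2V\nBeta,K3V\nGamma,G2V\n")
    found = search_catalog(path, {"Spectrum": "G2V"})
    print([star["Name"] for star in found])
```

Collecting matches in a list keeps the caller simple, at the cost of waiting for the whole file before anything is returned.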
It works well enough that I am ready to add real data and some way for people to use it!
Searching From The Command Line
Now that we know stellar can read a CSV and return results, it’s time to work on that empty main that has been sitting in
Oh yeah - we will want to make hygxyz.csv available now.
I will be pushing my copy into the data folder, next to
You can place your copy wherever you like, but make sure that you set the path appropriately in
Here is the result of all that work we have done setting up the project and support code.
The main subroutine in stellar is downright civilized compared to what we had for step 07.
All we do is search based on the command line parameters and display each of the matches.
$ parrot lib/stellar.pir Spectrum G2V ColorIndex 0.656
<Name: Sol, Spectrum: G2V, Distance: 0.000004848>
<Name: HD 7186, Spectrum: G2V, Distance: 112.359550561798>
<Name: HD 140235, Spectrum: G2V, Distance: 60.1684717208183>
<Name: HD 169019, Spectrum: G2V, Distance: 108.108108108108>
4 matches.
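The command line above passes field names and values in pairs. Here is a small Python sketch of how such pairs might be turned into search conditions and how the matches might be printed. It is not the PIR main from this project. The helper names `parse_conditions`, `describe`, and `report` are hypothetical, and the two stars are copied from the output above.

```python
def parse_conditions(args):
    # Pair up the arguments: field, value, field, value, ...
    if len(args) % 2 != 0:
        raise ValueError("expected field/value pairs")
    return dict(zip(args[0::2], args[1::2]))

def describe(star):
    # Mimic the one-line summary format shown in the output above.
    return "<Name: {Name}, Spectrum: {Spectrum}, Distance: {Distance}>".format(**star)

def report(matches):
    # One summary line per star, followed by a count.
    lines = [describe(star) for star in matches]
    lines.append("{} matches.".format(len(matches)))
    return "\n".join(lines)

# In a real script these would come from sys.argv[1:].
args = ["Spectrum", "G2V", "ColorIndex", "0.656"]
print(parse_conditions(args))

# Stand-in for search results; values copied from the output above.
matches = [
    {"Name": "Sol", "Spectrum": "G2V", "Distance": "0.000004848"},
    {"Name": "HD 7186", "Spectrum": "G2V", "Distance": "112.359550561798"},
]
print(report(matches))
```

Values stay as strings here, so a condition only matches when the text in the catalog is identical to the text typed on the command line.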
Hey, this thing is almost useful!
stellar has reached a major milestone.
When I started fiddling with the HYG Database, I wanted to write a command-line Parrot tool that could look up stars based on specific fields.
This step gives us that ability.
I admit that a lot more could be done.
For example, it only does exact matches.
You can easily find a star that is 108.108108108108 parsecs away (the HYG Distance field is measured in parsecs), but not stars that are roughly 108 parsecs away.
And forget about finding stars within 20 parsecs.
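To show what a looser match could look like, here is one possible sketch in Python. It assumes the current exact match compares text, which is why a search for 108 comes up empty. The helpers `is_roughly` and `is_within` and the default tolerance are hypothetical. The distance values are taken from the search output earlier in this step.

```python
import math

def is_roughly(value, target, tolerance=0.5):
    # True when the numeric value lies within `tolerance` of the target.
    return math.isclose(float(value), target, abs_tol=tolerance)

def is_within(value, limit):
    # True when the numeric value is no greater than the limit.
    return float(value) <= limit

# Distance values from the search output earlier in this step.
distances = [
    "0.000004848",
    "112.359550561798",
    "60.1684717208183",
    "108.108108108108",
]

print([d for d in distances if d == "108"])           # exact text comparison
print([d for d in distances if is_roughly(d, 108.0)]) # tolerance comparison
print([d for d in distances if is_within(d, 20.0)])   # upper-bound comparison
```

Converting the field to a number before comparing is the key step. The tolerance is a design choice that a real version would probably expose as another command line argument.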
I am going to take a little break from the stellar project, though.
Rakudo Star is almost out, and I want to play with that.
You can add to
Make it faster.
Make it object-oriented.
Make it a library.
Rewrite it in LOLCODE.
Just remember to give David Nash credit for creating the HYG Database.
We have been having all of this fun because he took the time to put that catalog together.