It’s time to treat the star database like a database. Well, it’s time to treat it like something we can search, anyways. I know this is not a trivial task in Parrot, so the Babysteps have been building up to it slowly. First, we figured out how to read from the database file and display its contents in a meaningful fashion. Second, we added subroutines to massage the data a little and produce some usable names for the thousands of stars in the database that do not have proper names. Then we suddenly spun off in a seemingly random direction, talking about testing and setting up larger projects. That was intentional, though. This sort of project requires more work in Parrot than in a language like Perl, due to Parrot’s lower-level nature. I wanted to be sure we could test this application as we add search functionality. We just pushed our script into the Stellar project to get that testing foundation.
First I am going to describe what is being built, then I’m going to work on the very important detail of examining a single star. We’ll have to wait until the next Baby Step before we start searching the catalog.
Thinking Through The Problem
We could start by creating a simple search function, testing every line in the HYG Catalog and verifying the results by hand. That will obviously not work. Part of the reason we have been writing this is that the catalog is not easily read without a little computer help.
It might be a better idea to take a smaller set and search against it. How small is good enough? A thousand? A hundred? Ten? We could reasonably start by searching a set of one star. It is certainly easy to manage. Okay, so we’ll start with a set of one star.
What do I mean when I talk about searching through the catalog? The basic idea is that we have a catalog of stars and some conditions, like “the ProperName is ‘Sol’.” We build a list of matches by examining each star, seeing if the conditions are true for that star. If they are, then that star goes in the list of matches. Either way, we then move on to the next star.
How do we determine if a single star is a match? Okay, we have a single star and some conditions. We determine if the star is a match by testing each condition. We can stop testing when we find a condition that is not true or we have run out of conditions to test. We know that this star is a match if every condition we tested was true.
There are a lot of technical details that we would think about if this code were intended for use in the real world. Large match lists could use a lot of memory. The conditions would need to allow for ranges or approximate matches. Luckily, this code is not intended for use in the real world. I can be as clumsy as I want, as long as my program gives the right answer.
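The matching logic described above can be sketched in a few lines. The sketch below uses Python rather than PIR, purely to show the shape of the idea before the Parrot version gets written. The names star_matches and search, and the use of a dict to stand in for a star, are assumptions made for illustration and are not part of Stellar.

```python
# Illustrative sketch only: Stellar itself is written in PIR.
# A star is modeled as a dict of field name -> string value (an assumption).

def star_matches(star, conditions):
    """Return True if every (field, desired_value) condition holds for star."""
    for field, desired_value in conditions:
        if star.get(field) != desired_value:
            return False  # stop at the first condition that is not true
    return True  # ran out of conditions without finding a failure


def search(catalog, conditions):
    """Build a list of matches by examining each star in turn."""
    matches = []
    for star in catalog:
        if star_matches(star, conditions):
            matches.append(star)
    return matches


# A tiny hypothetical catalog; only the fields needed here are filled in.
sol = {"ProperName": "Sol", "Spectrum": "G2V"}
arcturus = {"ProperName": "Arcturus"}
catalog = [sol, arcturus]
```

Calling search(catalog, [("ProperName", "Sol")]) on this two-star catalog would hand back a list holding only the Sol entry, which is the behavior the rest of this step builds toward one star at a time.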
Writing Code
Now we’re ready to go back to the stellar project and write some code.
Testing a Condition
Our first approach to checking a star’s details will be to check a single field. The star ProperName is a good field to start with:
example-0b-01/t/04-check-star.t
.include 'lib/stellar.pir'

.sub 'main' :main
    .include 'test_more.pir'
    .local string header_string
    .local string star_string
    .local string delimiter
    .local pmc    header_fields
    .local pmc    star

    header_string = "StarID,HIP,HD,HR,Gliese,BayerFlamsteed,ProperName,RA,Dec,Distance,PMRA,PMDec,RV,Mag,AbsMag,Spectrum,ColorIndex,X,Y,Z,VX,VY,VZ"
    delimiter     = ","
    star_string   = "0,,,,,,Sol,0,0,0.000004848,0,0,0,-26.73,4.85,G2V,0.656,0,0,0,0,0,0"

    plan(2)

    header_fields = split delimiter, header_string
    star = extract_from_csv_line(star_string, header_fields, delimiter)
    $S0 = 'Sol'

    $I0 = check_star_proper_name(star, 'Sol')
    ok($I0, 'Sol should have ProperName of "Sol"')
    $I0 = check_star_proper_name(star, 'Arcturus')
    nok($I0, 'Sol should not have ProperName of "Arcturus"')
.end
check_star_proper_name is an easy sub to write. My version is more verbose than necessary, to be honest:
example-0b-01/lib/stellar.pir
…
.sub check_star_proper_name
    .param pmc    star
    .param string desired_value
    .local string actual_value
    .local int    check_result

    actual_value = star['ProperName']
    check_result = desired_value == actual_value
    .return(check_result)
.end
The sub returns the result of comparing our desired ProperName with the actual value held in the star. I rely on Parrot to do the right thing when comparing desired_value with actual_value. Remember that Parrot automatically handles any type conversions, so we can ignore type for now.
Revisiting the header_string
I want to stop for a moment and look at my tests. One annoying fact is that every single test file includes the full header_string and delimiter. That is explicit behavior, which I like. Then again, it is also cluttering up my tests. The headers never change, yet I always include them. What if I could make the header string and delimiter optional?
I can make those parameters optional. Let’s reopen the test file 03-extract-from-csv.t.
example-0b-02/03-extract-from-csv.t
.sub 'main' :main
    …
    plan(4)
    …
    star = extract_from_csv_line(star_string, header_fields)
    $S0 = summarize_star(star)
    is($S0, summary, "delimiter should be optional")

    star = extract_from_csv_line(star_string)
    $S0 = summarize_star(star)
    is($S0, summary, "header_fields should be optional")
.end
How do we make those fields optional? We use the parameter modifiers :optional and :opt_flag.
.sub extract_from_csv_line
    .param string star_string
    .param pmc    header_fields     :optional
    .param int    has_header_fields :opt_flag
    .param string delimiter         :optional
    .param int    has_delimiter     :opt_flag

    # Fall back to a comma when the caller did not supply a delimiter.
    if has_delimiter goto CHECK_HEADER_FIELDS
    delimiter = ','

  CHECK_HEADER_FIELDS:
    # Fall back to the standard HYG headers when none were supplied.
    if has_header_fields goto BEGIN_EXTRACTING
    .local string header_string
    header_string = "StarID,HIP,HD,HR,Gliese,BayerFlamsteed,ProperName,RA,Dec,Distance,PMRA,PMDec,RV,Mag,AbsMag,Spectrum,ColorIndex,X,Y,Z,VX,VY,VZ"
    header_fields = split delimiter, header_string

  BEGIN_EXTRACTING:
    .local pmc star_fields
    .local pmc star

    star_fields = split delimiter, star_string
    star = extract_star_details(header_fields, star_fields)

    .return(star)
.end
The :optional modifier makes sense. Use it to tell Parrot that a particular parameter is not required for the sub to perform its duties. :opt_flag might require a little bit of explanation, though. It is a bookkeeping parameter provided by Parrot to let you know whether or not the preceding optional parameter was provided by the caller. You test the flag to see if the optional parameter was set. The name of the flag doesn’t matter.
example-0b-03.pir
.sub 'main' :main
    .local string eggs
    .local string topping
    .local string order

    eggs    = 'over easy'
    topping = "Frank's RedHot"

    order = breakfast(eggs, topping)
    say order

    order = breakfast(eggs)
    say order
.end

.sub breakfast
    .param string eggs
    .param string topping   :optional
    .param int    has_stuff :opt_flag
    .local string breakfast_order

    breakfast_order = 'Eggs cooked ' . eggs

    unless has_stuff goto SERVE_BREAKFAST
    breakfast_order .= ' topped with '
    breakfast_order .= topping

  SERVE_BREAKFAST:
    .return(breakfast_order)
.end
This program executes without a hitch. There is nothing important about the name of the flag.
$ parrot example-0b-03.pir
Eggs cooked over easy topped with Frank's RedHot
Eggs cooked over easy
Order does matter, though. You always want to put the flag after the optional parameter in your .param directives, or bad things will happen.
example-0b-04.pir
.sub 'main' :main
    .local string eggs
    .local string topping
    .local string order

    eggs    = 'over easy'
    topping = "Frank's RedHot"

    order = breakfast(eggs, topping)
    say order

    order = breakfast(eggs)
    say order
.end

.sub breakfast
    .param string eggs
    .param int    has_stuff :opt_flag
    .param string topping   :optional
    .local string breakfast_order

    breakfast_order = 'Eggs cooked ' . eggs

    unless has_stuff goto SERVE_BREAKFAST
    breakfast_order .= ' topped with '
    breakfast_order .= topping

  SERVE_BREAKFAST:
    .return(breakfast_order)
.end
See?
$ parrot example-0b-04.pir
Eggs cooked over easy
too few positional arguments: 1 passed, 2 (or more) expected
current instr.: 'breakfast' pc 34 (example-0b-04.pir:19)
called from Sub 'main' pc 26 (example-0b-04.pir:15)
Back to stellar. extract_from_csv_line can work the headers out for itself now. Let’s clean up our test code.
example-0b-05/t/04-check-star.t
.sub 'main' :main
    .include 'test_more.pir'
    .local string header_string
    .local string star_string
    .local string delimiter
    .local pmc    header_fields
    .local pmc    star

    star_string = "0,,,,,,Sol,0,0,0.000004848,0,0,0,-26.73,4.85,G2V,0.656,0,0,0,0,0,0"
    star = extract_from_csv_line(star_string)

    plan(2)

    $I0 = check_star_proper_name(star, 'Sol')
    ok($I0, 'Sol should have ProperName of "Sol"')
    $I0 = check_star_proper_name(star, 'Arcturus')
    nok($I0, 'Sol should not have ProperName of "Arcturus"')
.end
It is a little easier now to tell what I am actually testing in this code. Good.
I know I should be all methodical and orderly about checking my star fields, but I have not had enough sleep for that to be practical. Let’s check Spectrum.
example-0b-06/t/04-check-star.t
.sub 'main' :main
    …
    plan(4)
    …
    $I0 = check_star_spectrum(star, 'G2V')
    ok($I0, 'Sol should have Spectrum of "G2V"')
    $I0 = check_star_spectrum(star, 'K3V')
    nok($I0, 'Sol should not have Spectrum of "K3V"')
.end
Meanwhile, in stellar.pir:
example-0b-06/lib/stellar.pir
.sub check_star_spectrum
    .param pmc    star
    .param string desired_value
    .local string actual_value
    .local int    check_result

    actual_value = star['Spectrum']
    check_result = desired_value == actual_value
    .return(check_result)
.end
This works perfectly, but compare check_star_spectrum to check_star_proper_name. They are almost identical. In fact, the only difference between the code for the two subs is which field gets grabbed for actual_value.
It seems to me that the same behavior could be described by a single sub.
example-0b-06/t/04-check-star.t
.sub 'main' :main
    …
    plan(8)
    …
    $I0 = check_star_field(star, 'ProperName', 'Sol')
    ok($I0, 'Sol should have ProperName of "Sol"')
    $I0 = check_star_field(star, 'ProperName', 'Arcturus')
    nok($I0, 'Sol should not have ProperName of "Arcturus"')
    $I0 = check_star_field(star, 'Spectrum', 'G2V')
    ok($I0, 'Sol should have Spectrum of "G2V"')
    $I0 = check_star_field(star, 'Spectrum', 'K3V')
    nok($I0, 'Sol should not have Spectrum of "K3V"')
.end
check_star_field looks like a generic version of check_star_proper_name and check_star_spectrum.
example-0b-06/lib/stellar.pir
…
.sub check_star_field
    .param pmc    star
    .param string field
    .param string desired_value
    .local string actual_value
    .local int    check_result

    actual_value = star[field]
    check_result = desired_value == actual_value
    .return(check_result)
.end
All tests are still passing. You are working along with me and running tests, right? It’s time to decide what to do with those specific subs now that we have a nice general purpose field checker. You could argue that those subs should stick around but as wrappers that call check_star_field. That is entirely reasonable. I am comfortable using the general purpose sub as my main checker, though. I will delete the specific subs and their tests in my code.
Testing Multiple Conditions
stellar does a convincing job of checking a single field in a star. The next step is figuring out how to check multiple fields.
example-0b-07/t/04-check-star.t
.include 'lib/stellar.pir'

.sub 'main' :main
    .include 'test_more.pir'
    .local string header_string
    .local string star_string
    .local string delimiter
    .local pmc    header_fields
    .local pmc    star

    star_string = "0,,,,,,Sol,0,0,0.000004848,0,0,0,-26.73,4.85,G2V,0.656,0,0,0,0,0,0"
    star = extract_from_csv_line(star_string)

    plan(9)

    $I0 = check_star_field(star, 'ProperName', 'Sol')
    ok($I0, 'Sol should have ProperName of "Sol"')
    $I0 = check_star_field(star, 'ProperName', 'Arcturus')
    nok($I0, 'Sol should not have ProperName of "Arcturus"')
    $I0 = check_star_field(star, 'Spectrum', 'G2V')
    ok($I0, 'Sol should have Spectrum of "G2V"')
    $I0 = check_star_field(star, 'Spectrum', 'K3V')
    nok($I0, 'Sol should not have Spectrum of "K3V"')

    $I0 = check_star(star, 'ProperName', 'Sol')
    ok($I0, 'Sol should have ProperName "Sol"')
    $I0 = check_star(star, 'ProperName', 'Sol', 'Spectrum', 'G2V')
    ok($I0, 'Sol should have ProperName "Sol" and Spectrum "G2V"')
    $I0 = check_star(star, 'ProperName', 'Arcturus', 'Spectrum', 'G2V')
    nok($I0, 'Sol should not have ProperName "Arcturus" and Spectrum "G2V"')
    $I0 = check_star(star, 'ProperName', 'Sol', 'Spectrum', 'K3V')
    nok($I0, 'Sol should not have ProperName "Sol" and Spectrum "K3V"')
    $I0 = check_star(star, 'ProperName', 'Arcturus', 'Spectrum', 'K3V')
    nok($I0, 'Sol should not have ProperName "Arcturus" and Spectrum "K3V"')
.end
Yes, I did go through several variations on testing multiple fields. I wanted to make sure that check_star behaved exactly the way I expected. How am I going to make those tests succeed?
:slurpy and :flat Save the Day
Parrot gives us two excellent modifiers that simplify the job of checking multiple fields. The :slurpy param modifier lets us effectively say “Oh, there might be some other params. Just put them in an array.” The flip side of that is the :flat modifier in subroutine calls, which lets us say “Oh, I have this array of stuff. Just tack its elements onto the param list for the sub I’m about to call.”
Enough imaginary dialog. Here’s what check_star looks like.
example-0b-07/lib/stellar.pir
.sub check_star
    .param pmc    star
    .param string field
    .param string desired_value
    .param pmc    extra_fields :slurpy
    .local int    match_result
    .local int    extra_field_count

    match_result = check_star_field(star, field, desired_value)

    # We're done if this match fails.
    unless match_result goto RETURN_RESULT

    extra_field_count = extra_fields

    # We're done if there are no extra fields.
    unless extra_field_count goto RETURN_RESULT

    # Grab the result of checking the extra fields.
    match_result = check_star(star, extra_fields :flat)

  RETURN_RESULT:
    .return(match_result)
.end
I tried to make it clear what was going on in the comments. check_star is given a star and a handful of conditions. It only cares about the first condition, and stuffs the rest into extra_fields. If the first condition fails, we’re done. There’s no point in checking any more fields, so jump down to return the failure.
If there are no other conditions to check - which we determine by looking at the size of extra_fields - we return the result, which should be a success.
If there are more conditions to check, we call check_star again, using the star and the conditions we had stuffed into extra_fields. That goes through the same process of testing and looking for extra conditions until it has completed the last test. The result of all the completed tests is handed back to check_star, which then hands it back to us.
This process of recursion - solving a complex problem by breaking it down into small problems - is common in many programming languages, so it is supported by Parrot. After all, Parrot is supposed to be used for creating new languages. It is not just for browsing some guy’s star catalog.
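For readers who know Python better than PIR, the recursion above can be mirrored with a star-argument, which plays roughly the role of :slurpy when it appears in the parameter list and of :flat when it appears in a call. This is only an analogy, with a dict standing in for the star; it is a sketch to make the flow visible, not how Stellar is implemented.

```python
# Illustrative Python analogy for the PIR check_star; not part of Stellar.

def check_star_field(star, field, desired_value):
    """Compare one field of the star against the desired value."""
    return star.get(field) == desired_value


def check_star(star, field, desired_value, *extra_fields):
    """Check the first condition, then recurse on whatever is left."""
    # *extra_fields in the signature gathers leftover arguments (like :slurpy).
    if not check_star_field(star, field, desired_value):
        return False  # done: this condition failed
    if not extra_fields:
        return True   # done: no more conditions to check
    # *extra_fields in the call spreads them back out (like :flat).
    return check_star(star, *extra_fields)


sol = {"ProperName": "Sol", "Spectrum": "G2V"}
print(check_star(sol, "ProperName", "Sol", "Spectrum", "G2V"))  # True
print(check_star(sol, "ProperName", "Sol", "Spectrum", "K3V"))  # False
```

Each call peels one field and value pair off the front and passes the remainder along, so the chain of calls is exactly as long as the list of conditions, or shorter if a condition fails early.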
The “tail call” pattern of evaluating a sub and immediately returning its result is so common that Parrot provides the .tailcall directive to optimize its behavior. .tailcall essentially tells Parrot to immediately return the result of evaluating the sub rather than storing it in memory. It may not do much in a case like check_star, but you should see a difference in complex problems.
Then again, it might help check_star immensely. I don’t really know about Parrot optimization effects yet.
example-0b-08/lib/stellar.pir
.sub check_star
    …
    # Return the result of checking the extra fields.
    .tailcall check_star(star, extra_fields :flat)

  RETURN_RESULT:
    .return(match_result)
.end
Conclusion
We can now examine multiple fields to determine if a star matches a description we’ve provided. Along the way, we explored subroutine parameters. We learned how to make a parameter optional. We learned how to grab all of a subroutine’s params and stuff them into a list. We also learned how to paste the contents of a list onto the parameters of a subroutine call. We even dabbled in recursion, the fine art of breaking a big problem down with small solutions.
In our next step, we will use check_star to search through sets of stars. That’s the part I’ve been looking forward to for a while!
In the meantime, continue your own Parrot exploration. You might review the Parrot Subroutines chapter a little more. I hardly left that page while working through this step. Whatever you do, remember to have fun!