Monday, April 20, 2020

F# Active Patterns

My company deals with file parsing quite a bit, and I've been playing around with F# to see if I could write a better file parsing library using it instead of what we currently use in the C# world. This experimentation led me to F# Active Patterns.

F#, along with other languages, has a concept of Pattern Matching. Pattern matching allows a programmer to transform data by matching patterns in the shape of the data automatically, without a lot of if-then-else branching logic.

Pattern matching, using the match keyword, functions a lot like a switch statement in C#. A typical pattern matching statement in F# would look something like:

let filter1 x =
    match x with
    | 1 -> printfn "The value is 1."
    | _ -> printfn "The value is not 1."

filter1 5   // The value is not 1.
filter1 1   // The value is 1.
This code defines a function, filter1, that takes an integer. It checks whether the integer is the constant value 1. If it is, it prints a message telling us it is 1. If it isn't, it tells us the value is not 1. This code illustrates a constant pattern, and there are many additional options. The other options are described here.
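
To give a feel for a few of those other options, here is a small sketch. The function names describe and describeList are made up for illustration, and the functions return strings rather than printing so the results are easy to check.

```fsharp
// Hypothetical examples of a few other pattern forms.
let describe (point: int * int) =
    match point with
    | (0, 0) -> "origin"                          // tuple pattern with constants
    | (x, 0) -> sprintf "on the x-axis at %d" x   // variable pattern binds x
    | (0, y) -> sprintf "on the y-axis at %d" y
    | (x, y) when x = y -> "on the diagonal"      // guard using `when`
    | _ -> "somewhere else"                       // wildcard pattern

let describeList (items: int list) =
    match items with
    | [] -> "empty"                                 // empty list pattern
    | [ single ] -> sprintf "one item: %d" single   // list with exactly one element
    | first :: _ -> sprintf "starts with %d" first  // cons pattern: head and tail
```

For example, describe (3, 0) returns "on the x-axis at 3" and describeList [7; 8] returns "starts with 7".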

The example above, along with the other basic pattern matching options, is great, but not enough for some of my file parsing needs. Many of the files that we need to parse are fixed-width ASCII text files. Parsing the data requires defining fields in a particular order with a specified size. This aspect is unavoidable, but I wanted to find an elegant way to approach this problem.

This is where I found Active Patterns. Active Patterns allow you to define a named pattern and apply it in a match statement. This pattern will match and parse the data if defined correctly. It is also the mechanism for matching with regular expressions, which lend themselves well to my file parsing problem. An Active Pattern looks like this:

open System.Text.RegularExpressions

let (|EmailMatchActivePattern|_|) (input: string) =
    let m = Regex.Match(input, "(.*)@(.*)")
    if m.Success then Some m.Groups.[2].Value else None
This construct creates an Active Pattern named EmailMatchActivePattern that takes an input to match against. Inside the definition, I'm using a regex for a basic email pattern, (.*)@(.*). If the regex matches successfully, I return the second group, which is the text after the @. (The indexing starts at 0, but index 0 contains the entire text matched by the regex. The following indices contain the captured groups.) The (|Name|_|) form makes this a partial Active Pattern: it returns Some value on a match and None otherwise.

Now that I have a named pattern defined, I can use it in a normal pattern matching scenario to match a line from a file and parse the data in that line. I do that with code like:

let parseLine line =
    match line with
    | EmailMatchActivePattern (domainName) -> printfn "This is the domain in the email %s" domainName
    | _ -> printfn "Not an email address."
This code parses a line of text to find an email address. If the line matches, it parses the domain portion of the email address into domainName. I can then reference domainName in the subsequent function call (a print statement). This code matches and parses the data at the same time.
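
An Active Pattern can also hand back more than one value by returning a tuple. The sketch below is a variation on the pattern above, not the code I started with: the name Email, the stricter regex, and the describeLine function are all made up for illustration.

```fsharp
open System.Text.RegularExpressions

// Hypothetical partial active pattern that returns both captured groups as a tuple.
let (|Email|_|) (input: string) =
    let m = Regex.Match(input, @"^([^@\s]+)@([^@\s]+)$")
    if m.Success then Some (m.Groups.[1].Value, m.Groups.[2].Value) else None

// The tuple is destructured directly in the match case.
let describeLine line =
    match line with
    | Email (user, domain) -> sprintf "user=%s domain=%s" user domain
    | _ -> "not an email address"
```

Calling describeLine "jane@example.com" returns "user=jane domain=example.com", while describeLine "hello" falls through to the wildcard case.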

I took this concept further to explore the file parsing that I described at the beginning. In this exploration, I read lines from a file, match the first two characters of the line to determine a record type, and then parse the line into tuples with individual values. This construct allowed me to define my file format as a couple of predefined regular expressions and to match the records and parse them at the same time. The source code for this is below.
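
The idea can be sketched roughly as follows. This is not my actual source: the two record layouts ("HD" and "DT"), the Fields pattern, and the Record type are all hypothetical, chosen only to show a record type being detected and the fields being parsed in one match.

```fsharp
open System.Text.RegularExpressions

// Hypothetical parameterized active pattern: matches a regex and returns
// the captured groups (skipping group 0, which is the whole match).
let (|Fields|_|) (pattern: string) (input: string) =
    let m = Regex.Match(input, pattern)
    if m.Success then
        Some [ for i in 1 .. m.Groups.Count - 1 -> m.Groups.[i].Value ]
    else
        None

// Assumed fixed-width layouts:
//   header: "HD" + 8-character date + 10-character name
//   detail: "DT" + 5-character id   + 8-character amount
let headerLayout = @"^HD(.{8})(.{10})$"
let detailLayout = @"^DT(.{5})(.{8})$"

type Record =
    | Header of date: string * name: string
    | Detail of recordId: int * amount: decimal
    | Unknown of string

// The first two characters pick the layout; the capture groups carry the fields.
let parseRecord line =
    match line with
    | Fields headerLayout [ date; name ] -> Header (date, name.Trim())
    | Fields detailLayout [ recordId; amount ] -> Detail (int recordId, decimal amount)
    | other -> Unknown other
```

With these layouts, parseRecord "DT0004200012.50" gives Detail (42, 12.50M), and a line that matches neither layout comes back as Unknown.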

Active Patterns are an F# construct for matching and parsing data in a single step. They make it possible to build tools that parse files and other data structures with a minimal amount of code.

Monday, April 13, 2020

F# Sequence

When I was learning how to work with files, I was introduced to the seq expression, which is used to create a Sequence in F#. Sequences are powerful because the F# type seq<'T> is simply another name for the IEnumerable<T> interface in .NET.

IEnumerable<T> in .NET is an interface that exposes an enumerator of type T. This means that any data structure that implements IEnumerable<T> can be used in a loop to process the contents of the data structure. This construct is a powerful tool used for processing collections of things in the .NET space. Sequences are the manifestation of IEnumerable<T> in F#. Since Sequences are the generic implementation for collection data structures in F#, I need to understand them better.

When I was introduced to Sequences when working with files, I didn't fully understand how to iterate over the list and get direct access to the items in the collection. I was using the Seq.filter function to "iterate" over the list. Seq.filter will pass over each item in the Sequence, but it creates a new Sequence containing the items filtered by the function that is passed into it. I had incorrectly assumed it would output the filtered item. It does not. The filter function creates a new Sequence!
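
A tiny sketch of what I had misunderstood, using made-up values:

```fsharp
let numbers = seq { 1 .. 10 }

// Seq.filter does not hand back individual items; it returns a new sequence
// containing only the items for which the function returns true.
let evens = numbers |> Seq.filter (fun n -> n % 2 = 0)

// The new sequence is lazy, so nothing is evaluated until it is consumed.
let evensList = Seq.toList evens   // [2; 4; 6; 8; 10]
```

The original numbers sequence is left untouched.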

One approach that would have worked for my purposes was using a for loop. This construct loops over every item in a collection and executes the enclosed block of code for each one. for loops work very well, but they generally aren't considered a good fit for a functional programming mindset (see the note below).

Many functional programming languages have a map function (or an equivalent) that applies a function to each element in a collection. In F#, Seq.map does this and returns a new Sequence of the results, while Seq.iter applies a function to each element purely for its side effects (such as printing) and returns nothing. Seq.iter is what I was missing!
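
As a quick side-by-side with made-up data, Seq.map transforms a Sequence into a new one, and Seq.iter runs an action on each element:

```fsharp
let words = seq [ "alpha"; "beta"; "gamma" ]

// Seq.map builds a new (lazy) sequence from the result of the function.
let lengths = words |> Seq.map String.length

// Seq.iter runs the function on each element for its side effect and returns unit.
words |> Seq.iter (printfn "%s")
lengths |> Seq.iter (printfn "%d")
```

This prints the three words followed by their lengths (5, 4, 5), one per line.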

Now that I can simply iterate over a collection, I can easily read a file and print each line to the console. The following code does it:

open System
open System.IO

let readLinesFromFile (fileName: string) =
    seq {
        use reader = File.OpenText fileName
        while not reader.EndOfStream do
            yield reader.ReadLine()
    }
                
[<EntryPoint>]
let main argv =
    let lines = readLinesFromFile "test.txt"

    lines
        |> Seq.iter (fun x -> printfn "%s" x)
        
    0 // return an integer exit code

I feel better about my understanding of Sequences and how to use them in basic scenarios. It isn't everything I need to know, but I feel better equipped to use F#.

NOTE: Many functional programming languages don't include looping constructs in the language definition. (F# does have for and while loops, as used above.) Those languages accomplish the same thing with a "map" function to iterate over a collection and sophisticated function passing to process the contents of the collection. Recursion is also commonly used in place of loops. These approaches are considered more idiomatic in functional programming.
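
To make the note concrete, here is the same small task (summing a list) written three ways. The function names are made up for this sketch.

```fsharp
// Imperative: a mutable accumulator and a for loop.
let sumWithLoop (items: int list) =
    let mutable total = 0
    for item in items do
        total <- total + item
    total

// Recursive: pattern match on the list, carrying an accumulator
// so that the recursive call is in tail position.
let sumWithRecursion (items: int list) =
    let rec loop acc remaining =
        match remaining with
        | [] -> acc
        | head :: tail -> loop (acc + head) tail
    loop 0 items

// Higher-order: let List.fold do the traversal.
let sumWithFold (items: int list) = List.fold (+) 0 items
```

All three return 15 for [1; 2; 3; 4; 5]; the difference is in who owns the iteration and whether any state is mutated.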

Friday, April 10, 2020

Learning F# - Odds and Ends

Now that I've gotten through my basic steps to "learn" a new programming language, I'm ready to actually start learning the language. What do I mean by this? When I studied martial arts, my teacher always said "You don't start learning something until you've done it 10,000 times." This seems like an exaggeration, but it's true!

When learning something, one needs time with it to truly learn it. One learning to play an instrument must practice regularly to get good at playing the instrument. Someone learning to cook has to cook regularly to get better at it. It is the same with learning a new programming language. You need to work with it regularly to get better at it.

Now, repetition is good, but that repetition needs feedback for improvement to happen. For me with programming, unit testing provides that feedback. Unit tests help me know if my solution works, if my code is too complicated, and if my design is moderately reasonable. This is why learning a unit testing framework in a new language is an early step for me.

The rest of the "learning" steps that I've documented are a means to an end. Knowing how to interact with files, or a database, or a web API in a new programming language provides me the tools to use the language in real-world scenarios.

The .NET ecosystem provides some easy, low-cost ways to use F# in real-world scenarios. First, the .NET command line tool provides a REPL (Read-Eval-Print-Loop) for F#. This tool allows you to run F# code interactively in a terminal window. You access the REPL with the command:

dotnet fsi
You can access packages using the #r directive. In the REPL environment, the #r directive references a DLL somewhere on disk, using the full path to the DLL. You can then use the open keyword to make a namespace from the referenced DLL available.

The REPL also provides an ability to run F# scripts. An F# script file is designated with an fsx extension. It can be run from the command line with a call like:

dotnet fsi my_fsharp_script.fsx
This can be handy for day-to-day scripting tasks, and provides a way to practice with F#. (I documented a recent instance of this for me here.)
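
A script does not need an entry point; the file is simply evaluated from top to bottom. A minimal sketch, with a made-up file name:

```fsharp
// squares.fsx (hypothetical file name); run with: dotnet fsi squares.fsx
let square x = x * x

let squares = [ 1 .. 5 ] |> List.map square

// Prints 1, 4, 9, 16 and 25, one per line.
squares |> List.iter (printfn "%d")
```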

As I continue to work with F# and really learn the language, I will continue to document what I'm learning.

Tuesday, April 7, 2020

Learning F# - Step 6

The next step in my learning journey for a new language is to figure out how to access web APIs.

When I searched for how to access web APIs, I was brought back to FSharp.Data. This library also provides interfaces to make HTTP calls! (I was a little leery about using this package after my experience with the data access!)

As it turns out, this library works really well for HTTP calls! Once I added it to my project with the call:

dotnet add package FSharp.Data
and referenced the package in my code with:
open FSharp.Data
I was able to start accessing web sites.

Below is the most basic web request that you can make. In my case, I just pulled up Google and printed the HTML response to the console:

let google = Http.RequestString("http://www.google.com")
printfn "%s" google
This code calls "www.google.com" and prints the result to the console. This is a great start, but I need to be able to do more.

The next step for me is to figure out how to do more complex calls. That was fairly easy too. The following code calls Google with a specific search term and prints the result to the console:

let searchResults = Http.RequestString("http://www.google.com/search",
                                       httpMethod = "GET",
                                       query = [ "q", "butterflies"])
printfn "%s" searchResults
This code illustrates a few things that we need to know to access web APIs. First, it gives us a way to specify the HTTP verb: httpMethod = "GET". It also shows us how to send query parameters using query. The query parameter takes a list of name/value tuples. If we need to emulate a form submission, we can change the HTTP verb to POST and send a body built from a similar list of name/value pairs.
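
A form submission might look something like the sketch below. It uses the FormValues body case from FSharp.Data; the httpbin.org endpoint and the field names are stand-ins I picked for illustration (httpbin echoes the request back as JSON), not something from my own project.

```fsharp
open FSharp.Data

// Assumed stand-in endpoint: httpbin.org echoes the submitted form back as JSON.
let formResponse =
    Http.RequestString("http://httpbin.org/post",
                       httpMethod = "POST",
                       body = FormValues [ "q", "butterflies"; "count", "3" ])

printfn "%s" formResponse
```

Like query, FormValues takes name/value tuples, so switching between a GET with query parameters and a form POST is mostly a matter of moving the list.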

Now that I have learned some basic tasks to accomplish in F#, I can start using it to solve some real-world problems to learn the actual language. In my next post, I'll talk about some odds-and-ends I've found along this journey, as well as some good sites for diving into F# and getting a handle on functional programming.

Saturday, April 4, 2020

Learning F# - Step 5

The next step for me in learning a new language is interacting with a database. This step has kicked my butt!

Since F# is a Microsoft-sponsored language, I expected to find a straightforward approach to interacting with a SQL database. That wasn't the case!

When I searched for data access within F#, I was taken to this page. I wanted to work with a basic SQL database, so I clicked the link for SQL Data Access. I was pleasantly surprised to find several options for SQL data access. The first one I chose was FSharp.Data.SqlClient. I chose this one because it looked very similar to the SQL data access provided in C# by the System.Data.SqlClient namespace.

I jumped straight into implementing the sample code on the front page of FSharp.Data.SqlClient. As I built out the example and started to run it, I ran into errors that didn't make sense to me. As I read about this package, I realized it worked with System.Data.SqlClient, and I needed to add it to my program. I did that. I still had errors. As I read more about the errors I was seeing, I found that there is a dependency on mono for this package. In most cases, this wouldn't be a big deal, but it is for me at the moment.

I am doing most of my exploration of F# using .NET Core on a MacBook. As .NET has moved to support cross-platform development, it has relied on mono at times to supply functionality, so it isn't a surprise to learn that mono may still be required for some packages. For this particular learning step in my F# journey, I'm trying to minimize the complications and differences between F# on a Windows platform and cross-platform F#. To that end, I decided to use System.Data.SqlClient for this step.

The first thing I need to do is to set up a SQL database. For this purpose, I'm using SQL Server 2019 (actually running in a Docker image). I need to create a database and a table. Below are the T-SQL commands to do both:

create database FSharp

create table PlayingCard (
  ID int,
  CardValue varchar(5),
  Suit varchar(8)
)
Now that I have a database available, I can start working with it.

When I'm learning to use a new language and interact with a database, I try to do 2 basic operations: insert data and read data. Since I'm using System.Data.SqlClient and I'm familiar with this package in C#, I know that I need to create a SqlConnection, create a SqlCommand with my SQL statement, and then execute the command to perform the operation.

In F#, I start by making sure the package is available to my code with the open System.Data.SqlClient call. I then created my connection string for the database. I encountered the attribute Literal for the first time in F#. This annotation effectively creates a "constant" value. Now I'm ready to work with the database.

C# has a construct called using that allows a developer to open a resource (such as a file handle or database connection) in such a way that the compiler is told to dispose of the resource using the language construct instead of the developer calling it directly. It turns out, F# has the same construct!

I start my database interaction by calling using (new SqlConnection(connString)). This instantiates a new SqlConnection and passes it to the function that I supply as the second argument to using. In my case, that function is an anonymous function declared inline. An anonymous function is declared using the fun keyword, followed by any parameters that the function takes.

The first thing I do in my function is open the connection. The last thing I do is close the connection. This ensures that I don't abuse the connections available from the database.

Now, I want to put some data into the database so that I can try to read it back out. I create a new SqlCommand with my INSERT statement (values are hard coded) and the connection. I then execute the command with the ExecuteNonQuery call.

To read the data, I create a new SqlCommand with a SELECT statement to read all rows from my table and the connection. I execute the command with ExecuteReader and receive a SqlDataReader back. To get the values from the SqlDataReader, I iterate over the reader with a while loop and print the results to the console.

I started with an F# console app and added the data access package with the call:

dotnet add package System.Data.SqlClient
Here is the full code:
open System.Data.SqlClient

[<Literal>]
let connString = @"Data Source=.;Initial Catalog=FSharp;User=dbuser;Password=MyStrongPassword"
    
[<EntryPoint>]
let main argv =

    // System.Data.SqlClient
    using (new SqlConnection(connString)) ( fun conn ->
        conn.Open()
                                            
        let insertCmd = new SqlCommand("INSERT INTO PlayingCard (ID, CardValue, Suit) VALUES (3, '2', 'Diamonds')", conn)
        let count = insertCmd.ExecuteNonQuery()
                                            
        let readCmd = new SqlCommand("SELECT * FROM PlayingCard", conn)
        let reader = readCmd.ExecuteReader()
        while reader.Read() do
            printfn "%A %A %A" reader.[0] reader.[1] reader.[2]
            
        conn.Close()
    )
    
    0

If you are familiar with C# and interacting with a database in C#, this code is very comfortable and familiar. It also gives me another tool in my F# arsenal.

NOTE #1: F# has another construct for working with resources called use. It works similarly to using. I have not fully figured out the difference between the two versions.
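
As far as I can tell, the main difference is that using is a function that takes the resource and a function to run, while use is a binding that disposes the resource when the enclosing scope ends. The sketch below tries to show that ordering; the Tracked type and demo function are made up for illustration.

```fsharp
open System

// Hypothetical disposable that records when it has been disposed.
type Tracked(name: string, log: ResizeArray<string>) =
    member this.Name = name
    interface IDisposable with
        member this.Dispose() = log.Add(sprintf "disposed %s" name)

let demo () =
    let log = ResizeArray<string>()

    // `using` is a function: it disposes the resource as soon as
    // the function passed to it returns.
    using (new Tracked("a", log)) (fun r -> log.Add(sprintf "working with %s" r.Name))
    log.Add "after using"

    // `use` is a binding: the resource is disposed when the enclosing
    // scope (here, the body of inner) ends.
    let inner () =
        use r = new Tracked("b", log)
        log.Add(sprintf "working with %s" r.Name)
    inner ()
    log.Add "after use"

    List.ofSeq log
```

Calling demo () returns the log in order: "working with a", "disposed a", "after using", "working with b", "disposed b", "after use".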

NOTE #2: My "trick" to open and close the database connections in the same place is a common design pattern when working with resources such as files and databases. This pattern probably warrants its own blog post.

NOTE #3: I don't really like that I resorted to System.Data.SqlClient. It feels like I'm abusing the availability of C# libraries in the .NET space to get through a task. I need to do more research into F# data access to better understand the space and to find a more idiomatic approach to data access.

NOTE #4: Most of the code in this sample is procedural and not functional. I REALLY don't like this fact. I used it because it has the practical aspect that now I can start creating more meaningful applications with F#, but it hasn't helped me get better at thinking functionally.
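
One possible direction, sketched under some assumptions: wrap the reader in a Sequence so that the rows can be processed with Seq.filter and Seq.map instead of a while loop at the call site. The readCards function and the in-memory DataTable are stand-ins I made up so the sketch runs without a database; with a real connection, the SqlDataReader returned by ExecuteReader could be passed in instead, since it also implements IDataReader.

```fsharp
open System.Data

// Turn any IDataReader into a lazy sequence of (id, value, suit) tuples.
// Column order is assumed to match the PlayingCard table: ID, CardValue, Suit.
let readCards (reader: IDataReader) =
    seq {
        while reader.Read() do
            yield (reader.GetInt32(0), reader.GetString(1), reader.GetString(2))
    }

// Stand-in data source: an in-memory table exposes the same reader interface.
let sampleTable =
    let table = new DataTable()
    table.Columns.Add("ID", typeof<int>) |> ignore
    table.Columns.Add("CardValue", typeof<string>) |> ignore
    table.Columns.Add("Suit", typeof<string>) |> ignore
    table.Rows.Add(box 3, box "2", box "Diamonds") |> ignore
    table.Rows.Add(box 4, box "K", box "Spades") |> ignore
    table

// The rows are consumed (Seq.toList) before the reader is disposed.
let redCards =
    use reader = sampleTable.CreateDataReader()
    readCards (reader :> IDataReader)
    |> Seq.filter (fun (_, _, suit) -> suit = "Diamonds" || suit = "Hearts")
    |> Seq.map (fun (_, value, suit) -> sprintf "%s of %s" value suit)
    |> Seq.toList
```

The while loop is still there, but it is confined to one small function, and everything downstream is a pipeline of transformations.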

Wednesday, April 1, 2020

Question about Automated Tests

I received a question from a friend about end-to-end testing. The question was: "Is it wise to try to build end to end test automation around an intermittently unstable web app?" Instead of sending him my answer directly, I thought I would put my answer in a blog post.

I think it is worth having an end-to-end test for the intermittently unstable web app. I don't think it is worth the effort to automate those tests and include them as part of a CI/CD pipeline. Here's why I think this.

I am a HUGE believer in automated testing. I think having an end-to-end test of a system is a good thing. It gives developers and testers an easy way to exercise changes to the application. In some cases, these tests can be used to verify changes to a production system. It is critical to have these tests so that the functionality is documented and can be exercised by anyone with access to the tests.

I don't think it is worth it to automate these tests against an intermittently unstable system. The key words here are "intermittently unstable." If the tests can run repeatedly with consistent results, I would say automate their execution. The trouble with some web apps is that they may take longer to respond in some areas based on time of day or operation performed, causing inconsistent test results due to timeouts. It can cost a lot of time and effort to keep this automation running and the test results consistent. If this is the case, I don't think the automation effort is worth the time.

I hope this answers my friend's question. If not, I'll keep the thread going here.