All the types that we used so far were either already provided by the standard Prelude
, such as Float
, Char
, Double
, or Int
, or they were lists or tuples of these types. We also used type synonyms to define more convenient and descriptive names, as in
type Point = (Float, Float)
type Line = (Point, Point)
While type synonyms are convenient, they only associate new names with existing types. Sometimes we need entirely new types to be able to model data properly. For example, to create more varied graphics, we might want to extent the Line
type with a line style; then, instead of always drawing solid lines, we can offer dotted and dashed lines as alternative styles. How can we represent this attribute? We might be tempted to use strings and define
type LineStyle = String
type FancyLine = (Point, Point, LineStyle)
and use it as in
myLine :: FancyLine
myLine = ((0, 0), (1, 1), "dashed")
Unfortunately, this has a number of serious drawbacks. For one, it's easy to introduce mistakes that will lead to unpredictable behaviour — we might accidentally write
myLine :: FancyLine
myLine = ((0, 0), (1, 1), "bashed")
which (as far as the compiler is concerned) conforms to the type definition. After all, "bashed"
is a String
, even if it is not a valid line style. Consequently, each function using the line styles has to check —at runtime— if the string is a valid style. Here is an example of such a function (where the function elem
checks if an item is an element of a given list):
changeLineStyle :: FancyLine -> LineStyle -> FancyLine
changeLineStyle (x, y, _) newStyle
| newStyle `elem` ["solid", "dashed", "dotted"] = (x, y, newStyle)
| otherwise
= error $ "error in changeLineStyle: " ++ newStyle ++ " is not a valid style"
This is unsatisfactory! It is a hassle to implement all the error checking — and easily forgotten, too. It slows down code execution (string comparison is a relatively time consuming operation), but worst of all: invalid line styles cannot be spotted by the compiler before we run the program; instead, we get runtime errors.
A much better approach is to introduce a new data type that comprises exactly the three admissible line style value, so that the compiler can reject any program that attempts to use an illegal value. In Haskell, the keyword data
introduces the definition of a data type. It is followed by the name of the new type and its values (separated by |
characters), as in
data LineStyle
= Solid
| Dashed
| Dotted
The identifiers representing the elements of the new type (here, Solid
, Dashed
, and Dotted
) are called the data constructors of type LineStyle
. Just as type names, they must start with an uppercase letter. Now, we can define
myLine :: FancyLine
myLine = ((0, 0), (1, 1), Dashed)
and, if we misspell,
myLine :: FancyLine
myLine = ((0, 0), (1, 1), Bashed) -- Error: unknown data constructor
the compiler will be able to assist us by pointing out that the data constructor Bashed
is unknown. If we accidentally start the data constructor with a lower case letter, as in
myLine :: Line
myLine = ((0,0), (1, 1), dashed) -- Error: identifier not in scope
the compiler will treat it as a variable and report that dashed
is not in scope. Moreover, the check for a runtime error in our previous definition of changeLineStyle
is now obsolete and we can simply write
changeLineStyle :: FancyLine -> LineStyle -> FancyLine
changeLineStyle (x, y, _) newStyle = (x, y, newStyle)
All around, a better solution!
Types, such as LineStyle
, which consist of fixed set of simple data constructors, are called enumeration types. Another example of an enumeration type are the days of the week:
data Day
= Sunday
| Monday
| Tuesday
| Wednesday
| Thursday
| Friday
| Saturday
deriving (Enum)
In addition to defining the new data type Day
, the deriving
clause in this example instructs the Haskell compiler to include the new type Day
into the standard type class Enum
— the code for the methods of that type class are automatically synthesised by the Haskell compiler for us. Hence, we can now make use of Haskell's range syntax for values of Day
— for example,
[Monday .. Friday]
⇒[Monday, Tuesday, Wednesday, Thursday, Friday]
Another well-known enumeration type is Bool
, which is not a somewhat magical builtin type, but —in the Prelude
— simply defined as
data Bool = False | True
To summarise, definitions of enumeration types have the general form
data <TypeName> = <DataConstructorName1> | ⋯ | <DataConstructorNameN>
deriving <Classes>
where the deriving
clause is optional and the name of the new type as well as all the names of data constructors need to start with an uppercase letter.
Values of enumeration types are often scrutinised by way of pattern matching. For example, let us write a function that checks whether a given Day
is a week day. If it is a week day, we want to return True
; otherwise, False
.
isWeekday :: Day -> Bool
isWeekday Sunday = False
isWeekday Saturday = False
isWeekday _ = True
With this definition, we have
isWeekday Monday
⇒True
Pattern matching by way of multiple equations in a function definition (such as in the definition of isWeekday
) is merely a convenient shorthand for the more verbose, but also more flexible case expressions, which can occur anywhere in an expression (not just in the argument position of a function definition). Hence, we can rewrite isWeekend
as follows:
isWeekday :: Day -> Bool
isWeekday day = case day of
Sunday -> False
Saturday -> False
_ -> True
Case expressions have the following general form
case <Expr> of
<Pattern1> -> <BodyExpr1>
<Pattern2> -> <BodyExpr2>
⋯
where <Expr>
can be an arbitrary expression. Evaluation attempts to match the value of <Expr>
against the patterns from top to bottom, selecting the first alternative whose pattern matches. Evaluation, then, continues with the body expression of the selected alternative. If no pattern matches, evaluation raises a runtime exception signalling a pattern match failure. In the definition of isWeekday
, the pattern of the last alternative is an anonymous variable, which matches all values — it effectively serves as the default case.
Case expressions are another example of Haskell notation where the indentation matters: all the patterns need to be indented to the same level (c.f., Layout). However, we can always opt to group the patterns explicitly using curly braces, separated by semicolon. It is usually considered good style to nevertheless align the patterns, such as in
isWeekday :: Day -> Bool
isWeekday day = case day of
{ Sunday -> False
; Saturday -> False
; _ -> True
}
With explicit braces and semicolon, layout is no longer required — i.e., the following code is also correct, albeit not particularly readable:
isWeekday :: Day -> Bool
isWeekday day = case day of {Sunday -> False ; Saturday -> False; _ -> True}
Even simple functions, such as isWeekday
, can usually be implemented in a variety of ways. As an alternative to the explicit pattern matching in isWeekday
, we could construct a list containing the two days of the weekend, and then, test —with the Predude
function elem
— whether the argument to isWeekday
is contained in that list. This leads to a compact one-liner:
isWeekday :: Day -> Bool
isWeekday day = not $ day `elem` [Saturday, Sunday]
Unfortunately, it gives us a compiler error of the form
No instance for (Eq Day) arising from a use of ‘elem’
In the second argument of ‘($)’, namely
‘day `elem` [Saturday, Sunday]’
In the expression: not $ day `elem` [Saturday, Sunday]
In an equation for ‘isWeekday’:
isWeekday day = not $ day `elem` [Saturday, Sunday]
What went wrong? The type of elem
is
elem :: Eq a => a -> [a] -> Bool
The function elem
needs to compare its first argument with the elements of the list for equality. As we discussed in A First Glance at Overloading, this requires the type of compared data items to be a member of the type class Eq
. Unfortunately, that is not the case for Day
.
How can we fix this? In general, if we want a type to be a member of a type class, we have to provide a definition for the functions constituting the methods of the type class. Such a definition is called a type class instance for that particular type and we will discuss this topic in detail in a subsequent chapter.
However, for some type classes, the compiler can derive (that is, generate) a standard definition automatically. We have encountered this before in the form of the deriving clause for the Enum
type class. The derivation process is quite straight forward when generating Eq
instances for enumeration types. Presented with a new enumeration type of the form
data <TypeName> = <DataConstructorName1> | ⋯ | <DataConstructorNameN>
the compiler simply generates
<DataConstructorName1> == <DataConstructorName1> = True
⋯
<DataConstructorNameN> == <DataConstructorNameN> = True
_ == _ = False
Hence, if we define Day
as
data Day
= Sunday
| Monday
| Tuesday
| Wednesday
| Thursday
| Friday
| Saturday
deriving (Eq, Enum)
we can immediately use ==
, and hence, also elem
on Day
values — thus, our one-liner for isWeekday
will now work as intended. The second method defined by Eq
is inequality /=
and that is generated in an analogous manner.
In addition to Eq
and Enum
, it is common to derive instances for Ord
(to obtain comparison operators, such as <
, <=
, and so on) as well as Show
(to be able to use show
, print
, and similar).
Whether you want to derive Eq
, Ord
, and Enum
depends very much on how you use the values of a particular data type; in contrast, deriving Show
is almost always a good idea to help with experimentation and debugging. Without being a member of Show
, neither Haskell for Mac nor GHCi can display values of a user-defined type. Hence, we almost always add at least deriving (Show)
to a data
defnition.
In Spirals, Snowflakes & Trees: Recursion in Pictures, we defined a simple interface for line graphics including the following type synonym definitions:
-- LineGraphics Interface
type Point = (Float, Float)
type Vector = (Float, Float)
type Line = (Point, Point)
type Colour = (Int, Int, Int, Int) -- red, green, blue, opacity
Type synonyms are convenient. They shorten type signatures (e.g., we can write Colour
instead of (Int, Int, Int, Int)
) and they document the meaning of function arguments (e.g., they let us decide whether a particular use of the type (Float, Float)
represents a Point
or a Vector
). They however do not improve type safety. More specifically, if we accidentally pass a value of type Point
instead of a Vector
, the Haskell compiler will not complain — after all, in either case, we have got a pair of two floating-point values.
In other words, type synonyms give new names to existing types, but during type checking, the Haskell compiler does not distinguish between the new type name and the existing type that the new name refers to. To illustrate this, let us consider a function that moves a point by n
-times a given vector:
movePointN :: Float -> Vector -> Point -> Point
movePointN n (vx, vy) (x, y) = (n * vx + x, n * vy + y)
Now, if we accidentally pass the arguments in the incorrect order, as in movePointN 5 point vector
, the Haskell system will happily accept our program and perform a computation that we did not intend.
To avoid this problem —and thus, to increase type safety— a type synonym is not sufficient, we need to introduce a new data type instead:
data Point = Point Float Float
deriving (Show, Eq)
Just as our previous type synonym to represent points, this data type definition introduces the new type name Point
. In addition, it also introduces a new data constructor, which is also named Point
. This new data constructor gets two parameters (or arguments), each of type Float
; hence, we can construct a value of type Point
as follows:
zeroPoint :: Point
zeroPoint = Point 0 0 -- point at (0, 0)
The type signature of the data constructor is
Point :: Float -> Float -> Point
Here, the Point
on the left is the data constructor (a value) and the Point
on the right is the newly introduced type. We could also use different names for the type and the data constructor — e.g., we could call the data constructor MkPoint
instead. However, it is a common idiom in Haskell to use the same name if a type has only got a single data constructor. After all, there is no risk to confuse them — one is a value and the other is a type, and types and values are kept strictly separate in Haskell.
The general form of a data type definition with just one data constructor is
data <Type> = <Constructor> <Type1> ⋯ <TypeN>
deriving <Classes>
where the type of the data constructor is
<Constructor> :: <Type1> -> ⋯ -> <TypeN> -> <Type>
Such data type definitions are called product types, just like a tuple, they combine values of multiple types into a new compound value. In order for the deriving
clause to be valid, all parameter types <Type1>
to <TypeN>
need to be members of all the type classes listed in <Classes>
.
Analogous to Point
, we introduce a corresponding definition for vectors:
data Vector = Vector Float Float
deriving (Show, Eq)
Now, we can no longer accidentally confuse expressions that build a vector, such as Vector 1 1
, with those that construct a point, such as Point 1 1
. Moreover, they have got distinct types, which enables the type checker to keep points and vectors apart as well. If we want to convert one into the other, we need an explicit casting function that performs the conversion:
pointToVector :: Point -> Vector
pointToVector (Point x y) = Vector x y
Pattern matching decomposes the function argument of type Point
into its two components x
and y
, corresponding to the two arguments of the data constructor.
Now, we are in a position define a type safe version of the function movePointN
, using the new definitions of Point
and Vector
:
movePointN :: Float -> Vector -> Point -> Point
movePointN n (Vector vx vy) (Point x y)
= Point (n * vx + x) (n * vy + y)
We now have
movePointN 5 (Vector 1 3) (Point 0 0)
⇒Point (5 * 1 + 0) (5 * 3 + 0)
⇒Point 5 15
If we accidentally confuse the order of the parameters, the type checker will complain as vectors and points are now two distinct types.
Finally, we can also redefine Colour
as a product type instead of as a type synonym for a quadruple (4-tuple):
data Colour = Colour Int Int Int Int -- red, green, blue, and opacity component
deriving (Show, Eq)
We of course also need to adapt the definition of the pre-defined colour, such as:
red :: Colour
red = Colour 255 0 0 255
Now that we improved the type safety of the simple interface for line graphics from Spirals, Snowflakes & Trees: Recursion in Pictures, we are ready to take the next step and to extent its functionality. So far, all pictures were composed of simple paths (i.e., sequences of points connected by coloured lines). Next, we will enrich the representation of pictures to include closed shapes, such as circles, ellipses, and polygons. In addition to a customisable line colour, we will add line styles, and for closed shapes, also a fill style.
To this end, we introduce a new data type PictureObject
. A value of type PictureObject
, can be either a path, a circle, an ellipse, or a polygon. A simple enumeration type is not sufficient, though; in contrast to Day
and LineStyle
, where the name of the data constructor was all the information needed to characterise a variant, picture objects are more demanding. For example, for a path, we need to maintain the list of points that constitute the path, along with a line style and colour. As in the case of Point
and Vector
, we need a product type; i.e., the data constructor Path
requires parameters that describe the characteristics of the represented path. More precisely, we want the data constructor Path
to have the following type:
Path :: [Point] -> Colour -> LineStyle -> PictureObject
In other words, to construct a value of type PictureObject
with the data constructor Path
, we need to apply the latter to a list of points, a colour, and a line style. For example, we can now define a path object as follows
myPath :: PictureObject
myPath = Path [Point 210 200, Point 270 200, Point 545 600,
Point 525 600, Point 380 390, Point 250 600,
Point 230 600, Point 370 380, Point 260 215,
Point 210 215] red Solid
If we render that path, we get the picture in the following display.
How about circles? A circle is usually characterised by its center point as well as its radius. For our purposes, we also want to fix a line style and a fill style, where we use the following enumeration type for the latter:
data FillStyle
= NoFill
| SolidFill
deriving (Eq, Show)
Overall, we want the Circle
constructor to have the type
Circle :: Point -> Float -> Colour -> LineStyle -> FillStyle -> PictureObject
where the line style has no effect whenever the fill style is Solid
. Equipped with this data constructor, we can define three different circles as follows:
dashedCircle, dottedCircle, solidCircle :: PictureObject
dashedCircle = Circle (Point 400 400) 180 blue Dashed NoFill
dottedCircle = Circle (Point 400 400) 90 green Dotted NoFill
solidCircle = Circle (Point 400 400) 20 red Solid SolidFill
If we render all three into a single picture, we get
In a similar manner, we characterise an ellipse by its center point, its length and width, and the rotation of the x-axis of the ellipse, augmented by a colour, a line style, and fill style. Finally, a polygon, like a path, is characterised by a list of points for the polygon edges, but unlike a path, it has a fill style.
To combine the Path
, Circle
, Ellipse
, and Polygon
constructors into a single data type definition for PictureObject
, we combine the notation for enumeration types with that for product types. More precisely, we separate the alternative data constructors by a vertical bar |
and add the types of the data constructor parameters as arguments, which gets us
data PictureObject
= Path [Point] Colour LineStyle
| Circle Point Float Colour LineStyle FillStyle
| Ellipse Point Float Float Float Colour LineStyle FillStyle
| Polygon [Point] Colour LineStyle FillStyle
deriving (Show, Eq)
As before, we derive the type classes Show
and Eq
, which requires that the type parameters of all data constructors in that data definition are members of both type classes. Enumeration types where some or all of the data constructors are parameterised are more generally called sum types, or, put differently, conventional enumeration types are sum types where no data constructor has got any parameters.
Data type definitions in Haskell are generally speaking sum types (the various alternative data constructors) of product types (the multiple arguments of data constructors). As we will discuss in detail in a subsequent chapter, these definitions can also be recursive and they may be parameterised. Data types of this nature are also known as algebraic data types.
The following three definitions make use of the Ellipse
constructor:
redEllipse, greenEllipse, blueEllipse :: PictureObject
redEllipse = Ellipse (Point 400 400) 300 100 0 red Solid SolidFill
greenEllipse = Ellipse (Point 400 400) 300 100 (pi/4) green Solid SolidFill
blueEllipse = Ellipse (Point 400 400) 300 100 (pi/2) blue Solid SolidFill
In the following display, we render these three types of ellipses into individual images.
To produce more interesting pictures, we programmatically create picture objects and combine them into a list to obtain a composite picture:
type Picture = [PictureObject]
For example, to generate a sequence of n
ellipses, which are identical except for their rotation around their center, we start with a list of angles generated from a range:
[0, pi/n..(n-1) * pi/n]
We turn this into an ellipse per angle using the Prelude
function map
from Higher-order Functions:
simpleEllipsePic :: Float -> Picture
simpleEllipsePic n = map greenEllipse [0, pi/n..(n-1) * pi/n]
where
greenEllipse angle = Ellipse (Point 400 400) 250 70 angle myGreen Dashed SolidFill
myGreen = Colour 27 230 34 80
Alternatively, for a more compact definition, we can use a lambda abstraction:
simpleEllipsePic :: Float -> Picture
simpleEllipsePic n
= map (\angle -> Ellipse (Point 400 400) 250 70 angle myGreen Dashed SolidFill)
[0, pi/n..(n-1) * pi/n]
where
myGreen = Colour 27 230 34 80
By applying simpleEllipsePic
to the values 8, 12, and 60, respectively, we get the following images.
The following screencast illustrates picture generation by way of a few examples.
In addition to generating picture objects, we can transform them; for example, by translating a picture object along a vector.
movePictureObject :: Vector -> PictureObject -> PictureObject
For each variant of the data type PictureObject
, the function has to do something different. For instance, to move a path, every point in the path has to be translated by the given vector. We do this by defining a helper function that reuses the function that we defined before — the definition is compact due to its use of partial application:
movePoint :: Vector -> Point -> Point
movePoint = movePointN 1
We map
this function, partially applied to vector
, over the points in path that we want to move:
movePictureObject vector (Path points colour lineStyle)
= Path (map (movePoint vector) points) colour lineStyle
For circles and ellipses, only the center point has to be moved:
movePictureObject vector (Circle center radius colour lineStyle fillStyle)
= Circle (movePoint vector center) radius colour lineStyle fillStyle
movePictureObject vector (Ellipse center width height rotation colour lineStyle fillStyle)
= (Ellipse (movePoint vector center) width height rotation colour lineStyle fillStyle)
Finally, the code for a polygon is almost identical to that for a path:
movePictureObject vector (Polygon points colour lineStyle fillStyle)
= Polygon (map (movePoint vector) points) colour lineStyle fillStyle
To illustrate the behaviour of movePictureObject
graphically, we plot a few different shapes along the path described by sine and cosine curves:
curves :: Picture
curves
= map makeCircleSin xvals ++
map makePolygonCos xvals ++
map makeEllipseSin xvals ++
map makePathCos xvals
where
redCircle = Circle (Point 20 380) 10 red Solid SolidFill
greenEllipse = Ellipse (Point 20 380) 5 30 0 green Solid SolidFill
whitePolygon = Polygon [Point 20 380, Point 40 380, Point 40 420] white Solid SolidFill
bluePath = Path [Point 20 380, Point 40 380, Point 40 420] blue Solid
xvals = [0,10..780]
--
makeCircleSin x = movePictureObject (Vector x $ 100 * sin (pi * x/200)) redCircle
makePolygonCos x = movePictureObject (Vector x $ 100 * cos (pi * x/200)) whitePolygon
makeEllipseSin x = movePictureObject (Vector x $ 250 * sin (pi * x/200)) greenEllipse
makePathCos x = movePictureObject (Vector x $ 200 * cos (pi * x/200)) bluePath
In this function, we define the shapes using a position on the left hand side of the picture. The list xvals
contains the various distances by which each shape is to be moved along the x-axis; these are [0, 10, 20, 30, ... 780]
. The function makeCircleSin
, given such an x-distance yields a circle shape identical to the initial one, but moved to the left by x
and up (or down) by a value calculated by a scaled sine function. We do the same for the other shapes, but vary the scaling factor and whether we use the sine or cosine. The resulting image is the following.
To work through the examples in this chapter, please download the Haskell for Mac project with the PictureObject
definition and an implementation of drawPicture
: ShapeGraphics.hsproj. This project includes two Haskell modules: ShapeGraphics
and ShapeExamples
. The former contains the graphics definitions for you to use. The latter is where you can add code while working through this chapter. (We already included some of the definitions.)
Just open ShapeGraphics.hsproj
in Haskell for Mac and proceed as in the screencasts.
Run GHCi inside the ShapeGraphics.hsproj
directory after downloading and unpacking. Then, load the ShapeExamples
module. To visualise your drawings, you need to write them to disk as a PNG image file. To do so, use the writePng function as follows:
writePng FILENAME (drawPicture LINEWIDTH PICTURE)
By adopting user-defined data types for the graphics interface, we improved type safety and made the interface more expressive than the original line graphics from Spirals, Snowflakes & Trees: Recursion in Pictures. However, there is still room for improvement. For example, the definition of Colour
as a product type improved type safety over our original use of a type synonym. However, the type data definition still provides no indication as to which argument of the Colour
data constructor represents the, say, blue component of the colour. Moreover, as all colour components are of the same type, confusing them will not lead to a type error, but simply result in an incorrect program. To improve this situation, Haskell offers record syntax for product types.
data Colour
= Colour { redC :: Int
, greenC :: Int
, blueC :: Int
, opacityC :: Int
}
deriving (Show, Eq)
Here we associate each parameter of the data constructor Colour
with a name, which ought to indicate its meaning. Now, quite clearly, the blue component is the one labelled blueC
. We call each parameter of a data constructor using record syntax a field and its name a field label. Product types in record syntax are also called structs.
With that definition, we can create values of type Colour
either exactly as before (and ignore the field names when applying the data constructor)
red :: Colour
red = Colour 255 0 0 255
or we can use the names to clarify the meaning of each argument:
red :: Colour
red = Colour{redC = 255, opacityC = 255, blueC = 0, greenC = 0}
In the latter case, the fields can be initialised in any order. If we forget a field, the compiler will issue a warning —not a fatal error— and keep the field undefined.
In the same manner in which we can use field names to construct a value of record type, we can use them in pattern matching to identify data constructor arguments. Here we extract the green component of a colour:
greenComponent :: Colour -> Int
greenComponent Color{redC = red, opacityC = alpha, blueC = blue, greenC = green}
= green
In fact, we can keep this shorter: as we are only interested in the green component, we can ignore the rest and write
greenComponent :: Colour -> Int
greenComponent Color{greenC = green} = green
As extracting of individual fields of a record is a very common operation, Haskell's record syntax actually provides these projection functions automatically and they carry the same name as the field that they project. In other words, the definition of Colour
in record syntax implicitly defines four projection functions
redC :: Colour -> Int
greenC :: Colour -> Int
blueC :: Colour -> Int
opacityC :: Colour -> Int
Given those projections, the definition of greenComponent
becomes trivial:
greenComponent :: Colour -> Int
greenComponent = greenC
Although record projection functions are convenient, they come with a caveat. The projection functions are toplevel functions, and hence, all record field names need to be unique in the toplevel name space in which the data type is defined. For example, we cannot have a toplevel user-defined function called greenC
in the module defining Colour
; otherwise, the name of the projection function would clash with that user-defined function.
In addition to pattern matching using record syntax, we can also still pattern match using the conventional positional syntax, even if the definition of the product type does use record syntax — for example,
isOpaqueColour :: Colour -> Bool
isOpaqueColour (Colour _ _ _ opacity) = opacity == 255
We can also just use the projection function implied by the field name, as in
isOpaqueColour :: Colour -> Bool
isOpaqueColour colour = opacityC colour == 255
When choosing names for record fields, we not only need to consider the potential for name clashes with toplevel functions, but also name classes with other record field names. For example, we might be tempted to use the following definitions:
data Point
= Point { x :: Float, y :: Float}
data Vector
= Vector { x :: Float, y :: Float} -- Error: conflicting definitions of `x` and `y`
However, this leads to conflicting definitions of the two projection functions x
and y
— for example, x
would simultaneously have to be typed as x :: Point -> Float
, and as x :: Vector -> Float
. As a general rule of good style, we should avoid very short and general field names (even if they do not conflict with other field names) to avoid polluting the toplevel name space. Hence, we choose
data Point
= Point { xPoint :: Float
, yPoint :: Float}
deriving (Show, Eq)
data Vector
= Vector { xVector :: Float
, yVector :: Float}
deriving (Show, Eq)
On the basis of these definitions, we can rewrite the function movePointN
, such that it uses the field names:
movePointN :: Float -> Vector -> Point -> Point
movePointN n vector point
= Point { xPoint = n * xVector vector + xPoint point
, yPoint = n * yVector vector + yPoint point
}
However, it is important to note that we do not have to rewrite the function in this manner. The original definition of movePointN
is still valid as pattern matching using positional syntax is generally valid for product types defined using record syntax. It is generally wise to choose whichever variant leads to the clearest code.
As we saw above in the discussion of the function greenComponent
, pattern matching in record syntax leads to concise and clear code when only one or a few of the record fields are being used. However, this is of no immediate help in the common situation, where we want to produce a new value of record type by altering only one or a few of the components of the argument. For instance, recall the fade
function from Spirals, Snowflakes & Trees: Recursion in Pictures, which takes a colour and returns the same colour only slightly less opaque. We might define it as follows using record syntax:
fade :: Colour -> Colour
fade Colour{ redC = red, opacityC = alpha, blueC = blue, greenC = green }
= Colour{ redC = red
, greenC = green
, blueC = blue
, opacityC = max 0 (opacity - 10)
}
This is unnecessarily verbose as three out of four fields are simply copied from the argument to the result. To address this problem, record types support record update syntax of the form
<Expr> { <FieldName1> = <Expr1>, … <FieldNameN> = <ExprN> }
Here, <Expr>
must evaluate to a value of a record type that has fields named <FieldName1>
to <FieldNameN>
. It will then, produce a new value of the same type, where fields named <FieldName1>
to <FieldNameN>
have got the values <Expr1>
to <ExprN>
, but all other fields remain as in the value of <Expr>
. For example,
(Point 10 20){ xPoint = 0 }
⇒Point 0 20
We can use this notation to simplify the fade
function considerably:
fade :: Colour -> Colour
fade colour = colour{opacityC = max 0 (opacityC colour - 10)}
For more complicated data types, the code improvement is even more significant. Let us consider a function
setLineStyle :: PictureObject -> LineStyle -> PictureObject
that replaces the line style of the PictureObject
passed as a first argument by the line style given in the second argument. Given the current definition of PictureObject
, we will need to write code as the following
setLineStyle :: PictureObject -> LineStyle -> PictureObject
setLineStyle (Path points colour _) newLineStyle
= Path points colour newLineStyle
…
For each alternative in the definition of PictureObject
, we need an equation that copies all constructor parameters except the line style. Haskell supports record syntax not only for pure product types (i.e., data types with only one data constructor), but also for each alternative of a sum type (with multiple data constructors). Hence, we can use record syntax to improve the definition of PictureObject
:
data PictureObject
= Path
{ pointsPO :: [Point]
, colourPO :: Colour
, lineStylePO :: LineStyle
}
| Circle
{ centerPO :: Point
, radiusPO :: Float
, colourPO :: Colour
, lineStylePO :: LineStyle
, fillStylePO :: FillStyle
}
| Ellipse
{ centerPO :: Point
, widthPO :: Float
, heightPO :: Float
, rotationPO :: Float
, colourPO :: Colour
, lineStylePO :: LineStyle
, fillStylePO :: FillStyle
}
| Polygon
{ pointsPO :: [Point]
, colourPO :: Colour
, lineStylePO :: LineStyle
, fillStylePO :: FillStyle
}
deriving (Show, Eq)
We earlier discussed how all field names need to be distinct as they imply toplevel projection functions. There is an exception to this rule: if two fields belong to different data constructors of the same data definition and they have the same type, they may use the same field name. In our example, all data constructors of PictureObject
have got a field lineStylePO :: LineStyle
and all, but one data constructor have got a field fillStylePO :: FillStyle
.
This implies that a single projection function is used for all data constructors that share the field name. For example,
lineStylePO :: PictureObject -> LineStyle
obtains the line style of a picture object regardless of whether it represents a path, circle, ellipse, or polygon. It also means that the record update expression picObj{lineStylePO = newLineStyle}
sets the line style of any of the four variants that picObj :: PictureObject
might assume. Hence, we can define setLineStyle
simply as
setLineStyle :: PictureObject -> LineStyle -> PictureObject
setLineStyle picObj newLineStyle = picObj{lineStylePO = newLineStyle}
As a general rule, the more complex a data type gets (i.e., the more alternatives and more data constructor arguments there are), the more likely it is that record syntax improves code clarity. As a second example, let us rewrite the function movePictureObject
using record updates:
movePictureObject :: Vector -> PictureObject -> PictureObject
movePictureObject vec picObj@(Path _ _ _)
= picObj{pointsPO = map (movePoint vec) $ pointsPO picObj}
movePictureObject vec picObj@(Polygon _ _ _ _)
= picObj{pointsPO = map (movePoint vec) $ pointsPO picObj}
movePictureObject vec picObj
= picObj{centerPO = movePoint vec $ centerPO picObj}
The big win is again that we only need to mention the fields that change without any need to explicitly copy those that remain the same. We use @
-patterns (discussed in Spirals, Snowflakes & Trees: Recursion in Pictures) to match a particular data constructor, while binding the matched PictureObject
to the variable picObj
. The computations that need to be performed for the Path
and Polygon
data constructors as well as those for the Circle
and Ellipse
data constructors are identical. We can exploit that easily for the latter two by way of the third, catch all, equation. The Path
and Polygon
equations, however, repeat the same computation. We could use an additional helper function to avoid that duplication, but that seems hardly worth the trouble in this particular function definition.
Finally, we can even simplify the constructor matching —which ignores all arguments— by using record syntax without any fields.
movePictureObject :: Vector -> PictureObject -> PictureObject
movePictureObject vec picObj@Path{}
= picObj{pointsPO = map (movePoint vec) $ pointsPO picObj}
movePictureObject vec picObj@Polygon{}
= picObj{pointsPO = map (movePoint vec) $ pointsPO picObj}
movePictureObject vec picObj
= picObj{centerPO = movePoint vec $ centerPO picObj}
An important consideration when using projection functions is whether they are total or partial functions — a concept that we discussed in Fundamentals. In our example, PictureObject
, colourPO
and lineStylePO
are total functions. They always produce a result, regardless of the passed picture object. In contrast, pointsPO
and fillStylePO
are partial. For instance, applying pointsPO
to a Circle
results in a runtime exception.
The following screencast steps through the process of rewriting a function from using positional pattern matching to using record syntax.
Let us develop a function
rotatePictureObject :: Float -> Point -> PictureObject -> PictureObject
such that rotatePictureObject alpha pnt obj
rotates obj
clockwise by alpha
radians around the point pnt
, which represents the center of the rotation, as illustrated by the following diagram.
To implement rotation, we need to carefully consider which of the four variants of PictureObject
we are dealing with.
To rotate a path, we have to rotate each point constituting the path. Consequently, we need a helper function that rotates a single point around the rotation's center point as in the following illustration.
To simplify this task, we reuse the rotateLine
function from Spirals, Snowflakes & Trees: Recursion in Pictures. It effectively rotates the end point of the line around the start point. Using the old definition of Point
based on a type synonym, the code was the following:
-- Using old definition of 'Point' as type synonym
rotateLine :: Float -> Line -> Line
rotateLine alpha l@(p1@(x1, y1), p2@(x2, y2))
= ((x1, y1), (cos alpha * nx - sin alpha * ny + x1,
sin alpha * nx + cos alpha * ny + y1))
where
nx = x2 - x1
ny = y2 - y1
Adapting this code to the use the data type-based definition of Point
, we get
rotatePoint :: Float -> Point -> Point -> Point
rotatePoint alpha (Point x0 y0) (Point x y)
= Point (cos alpha * nx - sin alpha * ny + x0)
(sin alpha * nx + cos alpha * ny + y0)
where
nx = x - x0
ny = y - y0
We can easily map
this function over the points in the path, to rotate them all in the same manner.
rotatePictureObject :: Float -> Point -> PictureObject -> PictureObject
rotatePictureObject angle point picObj@Path{}
= picObj{ pointsPO = map (rotatePoint angle point) $ pointsPO pic }
In an analogous definition, we rotate the points constituting a polygon (which is essentially a closed path):
rotatePictureObject angle point picObj@Polygon{}
= picObj{ pointsPO = map (rotatePoint angle point) $ pointsPO picObj }
Rotating a circle is easier — we simply rotate its center point around the rotation center:
rotatePictureObject angle point picObj@Circle{}
= picObj{ centerPO = rotatePoint angle point $ centerPO picObj }
However, an ellipse presents an additional challenge as we also need to add the rotation factor to the rotation of the x-axis that is part of characterising an ellipse. If we add the equations for all four shapes, we get the following definition, where we combine the cases for Path
and Polygon
much as in the definition of movePictureObject
.
rotatePictureObject :: Float -> Point -> PictureObject -> PictureObject
rotatePictureObject angle point picObj@Circle{}
= picObj{ centerPO = rotatePoint angle point $ centerPO picObj }
rotatePictureObject angle point picObj@Ellipse{}
= picObj{ centerPO = rotatePoint angle point $ centerPO picObj
, rotationPO = rotationPO picObj + angle
}
rotatePictureObject angle point picObj
= picObj{ pointsPO = map (rotatePoint angle point) $ pointsPO pic }
We use the rotation function in the following definition that combines multiple instances of the same magenta polygon rotated by varying degrees around a common center.
magentaPoly :: PictureObject
magentaPoly
= Polygon [Point 430 400, Point 500 420, Point 680 400, Point 500 380] magenta Solid SolidFill
where
magenta = Colour{redC = 153, blueC = 153, greenC = 0, opacityC = 130}
rotatePic :: Int -> PictureObject -> Picture
rotatePic n shape = map (\n -> rotatePictureObject (0.2 * fromIntegral n) (Point 400 400) shape) [0..n]
By combining rotation with object translation, we can generate more dynamic pictures.
dynamicRotate:: PictureObject -> Point -> Vector -> Float -> Int -> Picture
dynamicRotate picObj _ _ _ 0 = [picObj]
dynamicRotate picObj pnt vec alpha n
= rotatePicObj : dynamicRotate (movePictureObject vec rotatePicObj)(movePoint vec pnt) vec alpha (n-1)
where
rotatePicObj = rotatePictureObject alpha pnt picObj
Data constructors and functions are somewhat similar. Both are applied to arguments, and then, produce a new entity. In fact, both share the same types. For example, we earlier discussed the type of Circle
:
Circle :: Point -> Float -> Colour -> LineStyle -> FillStyle -> PictureObject
Hence, we may regard data constructors as a special kind of function, a function that wraps its arguments up, such that they can be unwrapped again by pattern matching. This is in contrast to other functions, where we cannot recover the arguments after the function has been applied. For example, if we know that the application of the function (+)
resulted in the Int
value 10
, we can't undo this application and determine the arguments. Hence, data constructors are sometimes called free or uninterpreted functions — they merely store, but do not process their arguments.
Moreover, as data type definitions consist of a fixed number of data constructors, we (and the compiler) can infer from the type of an expression a set of patterns that will exhaustively cover all possible values that the expression may evaluate to. In fact, the Haskell system uses that information to generate a warning where patterns are not exhaustive, as this often constitutes an oversight that likely leads to incorrect program behaviour. For example, in the case of a PictureObject
, we know the only way to produce a value of this type is as the result of the application of either Path
, Circle
, Ellipse
, or Polygon
. We can always determine which one it was, and what the arguments were, either by pattern matching in a function equation or by using a case
expression:
showComponents :: PictureObject -> String
showComponents pic
= case pic of
Path points colour lineStyle
-> "The value is the result of the application of `Path` to " ++
show points ++
" and " ++
show colour ++
" and " ++
show linestyle
Circle ...
As data constructors are just a special form of function, we can also partially apply data constructors: the function zeroX
is defined in terms of a partial application of the Point
constructor, which has type Float -> Float -> Point
, to only the first argument:
zeroX :: Float -> Point
zeroX = Point 0
Applying zeroX
to 5
yields, as expected, Point 0 5
. Pattern matching can only be used on a data constructor once it has been applied to all its arguments — that is, we cannot use pattern matching on zeroX
to extract the 0
without first applying zeroX
to a second argument.
The similarity between data constructors and functions is the foundation for an alternative syntax for declaring algebraic data types, often called the generalised syntax for algebraic data types, as it enables declarations that cannot be expressed with the standard syntax. We defer discussing the new capabilities enabled by the generalised syntax to a subsequent chapter of this tutorial, as it is an extensive topic in itself. However, we outline the basic notation here, as it nicely illustrates the notion of data constructors as functions. Specifically, the generalised syntax simply consists out of declaring data constructors by way of their function signature and is identified by using the where
keyword instead of an equality symbol (=
) after the declaration head. For example,
data Point = Point Float Float
becomes
data Point where
Point :: Float -> Float -> Point
Similarly, the initial version of the PictureObject
data type can be defined as follows with the generalised syntax:
data PictureObject where
Path :: [Point] -> Colour -> LineStyle -> PictureObject
Circle :: Point -> Float -> Colour -> LineStyle -> FillStyle -> PictureObject
Ellipse :: Point -> Float -> Float -> Float -> Colour -> LineStyle -> FillStyle -> PictureObject
Polygon :: [Point] -> Colour -> LineStyle -> FillStyle -> PictureObject
To declare records with the generalised syntax, we put the record fields in braces, much like in the standard syntax. For example, we have
data PictureObject where
Path :: { pointsPO :: [Point]
, colourPO :: Colour
, lineStylePO :: LineStyle
} -> PictureObject
⋯
The generalised syntax for algebraic data types (also known as GADT syntax) is available as an extension to the standard Haskell language. To use that extension, we need to put the following language pragma at the top of the Haskell module or playground, where we want to use it:
{-# LANGUAGE GADTSyntax #-}
We already saw that the record notation permits some freedom in how to decompose records. For instance, we had the example
greenComponent :: Colour -> Int
greenComponent Color{greenC = green} = green
where we bind the value of the record field greenC
to the local variable green
. We also saw that we can use projection functions instead:
greenComponent :: Colour -> Int
greenComponent colour = greenC colour
Even more compact notation is enabled by a language extension known as record punning, which we can use if we put the pragma
{-# LANGUAGE NamedFieldPuns #-}
at the top of a module or playground. Punning introduces a shorthand for binding record fields to local variables having the same name as the record field. For example,
greenComponent :: Colour -> Int
greenComponent Color{greenC} = greenC -- with punning
is the same as
greenComponent :: Colour -> Int
greenComponent Color{greenC = greenC} = greenC -- without punning
We can also use punning in the construction of records, as in
red = Colour{redC, opacityC, blueC, greenC}
where
redC = 255
opacityC = 255
blueC = 0
greenC = 0
Finally, record wildcards, enabled by {-# LANGUAGE RecordWildCards #-}
, take this idea one step further. Instead of enumerating the field names, we can use ellipsis to enable punning for all field names that are not explicitly mentioned. This enables us to write the previous example as
red = Colour{..}
where
redC = 255
opacityC = 255
blueC = 0
greenC = 0
and the projection of the green component as
greenComponent :: Colour -> Int
greenComponent Color{..} = greenC -- with wildcard punning
By explicitly mentioning some names, we can, for example, use pattern matching on selected fields, as in this example of a function that returns the opacity of any colour with a fully saturated green channel:
brightGreenOpacity :: Colour -> Int
brightGreenOpacity Color{greenC = 255, ..} = opacityC
brightGreenOpacity _ = error "green not saturated"
Rewrite the defintion of map
map :: (a -> b) -> [a] -> [b]
map f [] = []
map f (x : xs) = f x : map f xs
to use case
notation — i.e., complete the following definition
map f xs = case xs of
…
Write a function which, given a day, returns the data constructor representing the following day:
nextDay :: Day -> Day
How would you define a data type to represent the different cards of a deck of poker cards? How would you represent a hand of cards?
Can you define a function which, given a hand of cards calculates its values according to the 21-rules: that is, all the cards from 2 to 10 are worth their face value. Jack, Queen, King count as 10. The Ace card is worth 11, but if this would mean the overall value of the hand exceeds 21, it is valued at 1.
Can you define a function simpleCirclePic :: Float -> Picture
, similar to simpleEllipsePic
that we defined earlier, such that it generates a picture of circles, which increase in size in dependence on the floating point argument, such that the radius of the biggest circle does not exceed 400
. The generated pictures should look like those in the following display.
Sierpinski triangles are another well-know fractal pattern. Can you define a function
sierpinski :: Int -> Picture
which, when applied to 0 to 5, produces the following pictures?
Can you adapt the colouredFTree
tree function from Higher-order Functions to the new point, line, colour, and picture types, such that they produce trees consisting of coloured polygons (specifically, pentagons formed from the base square and the triangle used to construct the tree components) such that the function can produce the trees displayed below?
As another variation on tree generation, change the algorithm to produce rectangles of various length-width ratios instead of just squares as the basis for polygon construction. This leads to somewhat more organic looking shapes as those in the next set of images.