The Curious Programmer

Software, Gadgets, Books, and All Things Geek

How do Computers See? — June 19, 2017

How do Computers See?

(This is part 3 in a series of posts on artificial intelligence and deep learning/neural networks. You can check out part 1 and part 2 if you haven’t yet read them and are new to AI)

There was a time when artificial intelligence lived only in our most creative imaginations. Yet isn't that where technology is born? In our imaginative minds? Though it is tempting to jump right into the technological advances that are the driving forces behind AI, we must first take a trip back in time and look at how far we have come since Samuel Butler wrote in his 1872 novel Erewhon,

“There is no security against the ultimate development of mechanical consciousness, in the fact of machines possessing little consciousness now. A jellyfish has not much consciousness. Reflect upon the extraordinary advance which machines have made during the last few hundred years, and note how slowly the animal and vegetable kingdoms are advancing. The more highly organized machines are creatures not so much of yesterday, as of the last five minutes, so to speak, in comparison with past time.”

From Karel Capek's 1920 play R.U.R., which depicted a race of self-replicating robot slaves who rose up and revolted against their human masters, to Star Trek's android Data, humans have always imagined the day machines would become intelligent.

Today, not only is AI a reality, but it is changing the very way we live and work. From the AI that helps autonomous vehicles perceive and navigate the world around them to Google's AI voice assistant, we are unwittingly surrounded by artificial intelligence. The question most people ask is, "How does it all work?"

I could not answer that in one article. I will, however, try to cover a small subset of AI today that has given computers an ability most humans take for granted, but would greatly miss if it were taken away…the power of sight!

The Problem

Why has recognizing an image been so hard for computers and so easy for humans? The answer boils down to the algorithms used for both. Algorithms? Wait, our brains don’t have algorithms, do they??

I, and many others, do believe our brains have algorithms…a set of laws (physics) that are followed, which allow our brain to take data from our senses and transform it into something our consciousness can classify and understand.

Computer algorithms for vision have been nowhere near as sophisticated as our biological algorithms. That is until now.

Artificial Neural Networks Applied to Vision

(If you haven’t been introduced to neural networks yet, please check out this post first to get a quick introduction to the amazing world of ANNs)

Artificial neural networks (ANNs) have been around for a while now, but recently a particular type of ANN has broken records in computer vision competitions and changed what we thought was possible in this problem space. We call this type of ANN a convolutional neural network.

Convolutional Neural Networks

Convolutional neural networks, also known as ConvNets or CNNs, are among the most effective computational models for tasks such as pattern recognition. Yet, despite their importance to aspiring developers, many struggle to understand just what CNNs are and how they work. To penetrate the mystery, we will work with the common application of CNNs to computer vision, which begins with a matrix of pixels. Then we'll go layer by layer, and operation by operation, through the CNN's deep structure, finally arriving at its output: the identification of a cloud, cat, tree, or whatever the CNN's best guess is about what it's witnessing.

High-Level Architecture of a CNN

[Image: high-level architecture of a CNN — source: ResearchGate.com]

Here you can see the conceptual architecture of a typical (simple) CNN. To come up with a reasonable interpretation of what it’s witnessing, a CNN performs four essential operations, each corresponding to a type of layer found in its network.

These four essential operations (illustrated above) in a CNN are:

  1. The Convolution Layer
  2. The ReLU activation function
  3. Pooling/subsampling Layer
  4. Fully Connected ANN (Classification Layer)

The input is passed through each of these layers and will be classified in the output. Now let’s dig a little bit deeper into how each of these layers works.

The Input: A Matrix of Pixels

To keep things simple, we'll only concern ourselves with the most common task CNNs perform: pattern or image recognition. Technically, a computer doesn't see an image, but a matrix of pixels, each of which has three components: red, green, and blue. Therefore, a 1,000-pixel image for us becomes 3,000 values for a computer. It assigns an intensity to each of those 3,000 components. The result is a matrix of 3,000 precise pixel intensities, which the computer must somehow interpret as one or more objects.
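
To make that concrete, here is a minimal sketch in plain Java (with made-up dimensions and values) of what a tiny image looks like once it has been reduced to numbers:

public class PixelMatrix {
    public static void main(String[] args) {
        int height = 2, width = 2;                    // a tiny 2 x 2 "image"
        int[][][] image = new int[height][width][3];  // [row][column][R, G, B]

        image[0][0] = new int[] {255, 0, 0};      // a red pixel
        image[0][1] = new int[] {0, 255, 0};      // a green pixel
        image[1][0] = new int[] {0, 0, 255};      // a blue pixel
        image[1][1] = new int[] {128, 128, 128};  // a gray pixel

        // The network never sees "red" or "gray" -- only these raw numbers.
        System.out.println("Values per image: " + (height * width * 3));
    }
}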

The Convolution Layer

The first key point to remember about the convolutional layer is that all of its units, or artificial neurons, are looking at distinct, but slightly overlapping, areas of the pixel matrix. Teachers and introductory texts often use the metaphor of parallel flashlight beams to help explain this idea. Suppose you have a parallel arrangement of flashlights with each of the narrow beams fixated on a different area of a large image, such as a billboard. The disk of light created by each beam on the billboard will overlap slightly with the disks immediately adjacent to it. The overall result is a grid of slightly overlapping disks of light.

[Image: overlapping receptive fields producing a feature map — source: i.stack.imgur.com/GvsBA.jpg]

The second point to remember about the convolution layer is that those units, or flashlights if you prefer, are all looking for the same pattern in their respective areas of the image. Collectively, the set of pattern-searching units in a convolutional layer is called a filter. The method the filter uses to search for a pattern is convolution.

The complete process of convolution involves some rather heavy mathematics. However, we can still understand it from a conceptual point of view, while only touching on the math in passing. To begin, every unit in a convolutional layer shares the same set of weights that it uses to recognize a specific pattern. This set of weights is generally pictured as a small, square matrix of values. The small matrix interacts with the larger pixel matrix that makes up the original image. For example, if the small matrix, technically called a convolution kernel, is a 3 x 3 matrix of weights, then it will cover a 3 x 3 array of pixels in the image. Naturally, there is a one-to-one relationship, in terms of size, between the 3 x 3 convolution kernel and the 3 x 3 section of the image it covers. With this in mind, you can easily multiply the weights in the kernel with the counterpart pixel-values in the section of the image at hand. The sum of those products, technically called the dot product, generates a single pixel value that the system assigns to that section of the new, filtered version of the image. This filtered image, known as the feature map, then serves as the input for the next layer in the ConvNet described below.
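
Here is a bare-bones sketch of that sliding dot product in plain Java. The method and variable names are my own, and it assumes a single-channel image, a 3 x 3 kernel, no padding, and a stride of 1; real frameworks handle all of those details and apply many filters at once.

public class Convolution {
    static double[][] convolve(double[][] image, double[][] kernel) {
        int k = kernel.length;                        // kernel size (3 here)
        int outH = image.length - k + 1;
        int outW = image[0].length - k + 1;
        double[][] featureMap = new double[outH][outW];

        for (int row = 0; row < outH; row++) {
            for (int col = 0; col < outW; col++) {
                double dot = 0.0;
                // Dot product of the kernel with the image patch it covers.
                for (int i = 0; i < k; i++) {
                    for (int j = 0; j < k; j++) {
                        dot += kernel[i][j] * image[row + i][col + j];
                    }
                }
                featureMap[row][col] = dot;           // one value in the new feature map
            }
        }
        return featureMap;
    }
}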

It’s important to note at this point that units in a convolutional layer of a ConvNet, unlike units in a layer of a fully-connected network, are not connected to units in their adjacent layers. Rather, a unit in a convolutional layer is only connected to the set of input units it is focused on. Here, the flashlight analogy is again useful. You can think of a unit in a convolutional layer as a flashlight that bears no relation to the flashlights ahead of it, or behind it. The flashlight is only connected to the section of the original image that it lights up.

The ReLU Activation Function

The rectified linear unit, or ReLU, performs the rectification operation on the feature map, the output of the convolution layer. Rectification introduces real-world non-linearity into the CNN so that the network can be properly trained and tuned using a feedback process known as back-propagation. Introducing non-linearity is important and powerful because it lets a neural network model problems whose input parameters are inherently nonlinear.

[Image: the ReLU family of activation functions — source: datasciencecentral.com]

Above you can see three different implementations of a ReLU activation function (the most basic being plain ReLU). Different ReLU variants are used for different problems, depending on which one breaks the linearity of the input parameters most effectively.
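
As a rough illustration, here is what the basic and leaky variants look like as plain Java functions, applied element-wise to a feature map. The 0.01 leak factor is a common convention, not a fixed rule.

public class Relu {
    static double relu(double x)      { return Math.max(0.0, x); }
    static double leakyRelu(double x) { return x >= 0 ? x : 0.01 * x; }

    // Applied element-wise to a feature map coming out of a convolution layer.
    static double[][] rectify(double[][] featureMap) {
        double[][] out = new double[featureMap.length][featureMap[0].length];
        for (int i = 0; i < featureMap.length; i++)
            for (int j = 0; j < featureMap[0].length; j++)
                out[i][j] = relu(featureMap[i][j]);   // negative values become zero
        return out;
    }
}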

The Pooling Layer

The more intricate the patterns the CNN searches for, the more convolution and ReLU layers are necessary. However, as layer after layer is progressively added, the computational complexity quickly becomes unwieldy.

[Image: pooling (subsampling) a feature map — source: wiki.tum.de]

Another layer, called the pooling or subsampling layer, is needed to keep the computational complexity from getting out of control. The pooling layer's essential operation is downsampling: it shrinks each feature map (for example, by keeping only the maximum value in each small region), isolating only the most relevant information for the purposes at hand.
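
Here is a minimal sketch of 2 x 2 max pooling, one common way this subsampling is done. The layout and names are my own, and it assumes even dimensions to stay short.

public class MaxPool {
    static double[][] maxPool2x2(double[][] featureMap) {
        int outH = featureMap.length / 2;
        int outW = featureMap[0].length / 2;
        double[][] pooled = new double[outH][outW];

        for (int row = 0; row < outH; row++) {
            for (int col = 0; col < outW; col++) {
                double max = featureMap[2 * row][2 * col];
                // Each 2 x 2 block collapses to its largest value,
                // halving the width and height of the feature map.
                for (int i = 0; i < 2; i++)
                    for (int j = 0; j < 2; j++)
                        max = Math.max(max, featureMap[2 * row + i][2 * col + j]);
                pooled[row][col] = max;   // keep only the strongest response
            }
        }
        return pooled;
    }
}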

The Classification Layer

Finally, the CNN requires one or more layers to classify the output of all previous layers into categories, such as cloud, cat, or tree.

The most obvious characteristic that distinguishes a classification layer from other layers in a CNN is that a classification layer is fully-connected. This means that it resembles a classic neural network (which we discussed in part 2), with the units in each layer connected to all of the units in their adjacent layers. Accordingly, classification layers often go by the name fully-connected layers, or FCs.
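
As a rough sketch of what "fully-connected" means in code, here is a toy classification layer in plain Java: every input feature connects to every output category. The softmax step at the end is one common way to turn raw scores into probabilities; the post doesn't prescribe it, and the weights would normally come from training rather than being written by hand.

public class ClassificationLayer {
    static double[] classify(double[] features, double[][] weights, double[] bias) {
        int categories = weights.length;              // e.g. cloud, cat, tree
        double[] scores = new double[categories];
        for (int c = 0; c < categories; c++) {
            scores[c] = bias[c];
            for (int f = 0; f < features.length; f++)
                scores[c] += weights[c][f] * features[f];   // every feature feeds every category
        }
        // Softmax: exponentiate and normalize so the scores sum to 1.
        double sum = 0.0;
        double[] probs = new double[categories];
        for (int c = 0; c < categories; c++) { probs[c] = Math.exp(scores[c]); sum += probs[c]; }
        for (int c = 0; c < categories; c++) probs[c] /= sum;
        return probs;                                 // e.g. [0.1, 0.7, 0.2]
    }
}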

Depth and Complexity

Most CNNs are deep neural networks, meaning their architecture is quite complex, with dozens of layers. You might have, for example, a series of four alternating convolution and ReLU layers, followed by a pooling layer. Then this entire series of layers might, in turn, repeat several times before introducing a final series of fully-connected layers to classify the output.

Unraveling the Mystery of CNNs

Convolutional neural networks are deep, complex computational models that are ideal for performing certain tasks, such as image recognition.

[Image: a CNN recognizing a car — source: computervisionblog.com]

To understand how a CNN recognizes a pattern in an image, it's valuable to go step by step through its operations and layers, beginning with its input: a matrix of pixel values. The first layer is the convolution layer, which uses the convolution operation to multiply a specific set of weights, the convolution kernel, by various sections of the image in order to filter for a particular pattern. The next layer is the ReLU layer, which introduces nonlinearity into the system to properly train the CNN. There may be a series of several alternations between convolution and ReLU layers before we reach the next layer, the pooling layer, which restricts the output to the most relevant patterns. The entire series of convolution, ReLU, and pooling layers may, in turn, repeat several times before we reach the final classification layers. These are fully-connected layers that classify the CNN's output into likely categories, such as cloud, cat, tree, etc.

[Image: a deeper CNN architecture — source: mdpi.com]

This is just a high-level look at how a typical CNN is architected. There are many variations that experts use in practice to tune their networks for their particular use cases. This is where the expertise comes into play. You may need to "tune" your network if the initial training does not produce results as accurate as you had hoped. This process is called "hyperparameter tuning," and I will have to write a whole article just covering that. For now, familiarize yourself with the basics of ANNs and CNNs and come back soon to read about hyperparameter tuning in the near future!

As always, thanks so much for reading! Please tell me what you think or would like me to write about next in the comments. I’m open to criticism as well!

If you take the time to “like” or “share” the article, that would mean a lot to me. I write for free on my own time because I enjoy talking about technology and the more people that read my articles, the more individuals I get to geek out with!

Thanks and have a great day!

From Fiction to Reality: A Beginner’s Guide to Artificial Neural Networks — June 12, 2017

From Fiction to Reality: A Beginner’s Guide to Artificial Neural Networks

Interest in artificial intelligence is reaching new heights. 2016 was a record year for AI startups and funding, and 2017 will certainly surpass it, if it hasn't already. According to IDC, spending on cognitive systems and AI will rise more than 750% by 2020. Both interest and investment in AI span the full spectrum of the business and technology landscape, from the smallest startups to the largest corporations, from smartphone apps to public health and safety systems. The biggest names in technology are all investing heavily in AI, baking it into their business models and using it increasingly in their offerings: virtual assistants (e.g. Siri), computer vision, speech recognition, language translation, and dozens of other applications. But what is the actual IT behind AI? In short: artificial neural networks (ANNs). Here we take a look at what they are, how they work, and how they relate to the biological neural networks that inspired them.

Defining an Artificial Neural Network

The term artificial neural network is used either to refer to a mathematical model or to an actual program that mimics the essential computational features found in the neural networks of the brain.

The Neuron

[Image: anatomy of a neuron — source: http://www.robots.ox.ac.uk]

Although biological neurons are extremely complicated cells, their essential computational nature in terms of inputs and outputs is relatively straightforward. Each neuron has multiple dendrites and a single axon. The neuron receives its inputs from its dendrites and transmits its output through its axon. Both inputs and outputs take the form of electrical impulses. The neuron sums up its inputs, and if the total electrical impulse strength exceeds the neuron’s firing threshold, the neuron fires off a new impulse along its single axon. The axon, in turn, distributes the signal along its branching synapses which collectively reach thousands of neighboring neurons.

Biological vs Artificial Neurons

[Image: biological vs. artificial neuron — source: DataCamp]

There are a few basic similarities between neurons and transistors. They both serve as the basic unit of information processing in their respective domains; they both have inputs and outputs; and they both connect with their neighbors. However, there are drastic differences between neurons and transistors as well. Transistors are simple switching elements, generally connected to only a few other transistors. Neurons, by contrast, are highly complex organic structures, each connected to roughly 10,000 other neurons. Naturally, this rich network of connections gives neurons an enormous advantage over transistors when it comes to performing cognitive feats that require thousands of parallel connections. For decades, engineers and developers have envisioned ways to capitalize on this advantage by making computers and applications operate more like brains. Finally, their ideas have made their way into the mainstream. Although transistors themselves will not look like neurons anytime soon, some of the AI software they run can now mimic basic neural processing, and it's only getting more sophisticated.

Modeling the Neuron

The perceptron

[Image: mathematical model of a neuron (the perceptron) — source: http://cs231n.github.io/neural-networks-1/]

The perceptron, or single-layer neural network, is the simplest model of neural computation, and is the ideal starting point to build upon. You can think of a perceptron as a single neuron. However, rather than having dendrites, the perceptron simply has inputs: x1, x2, x3,…,xN. Moreover, rather than having an axon, the perceptron simply has a single output: y = f(x).

Weights

Each of the perceptron’s inputs (x1, x2, x3,…,xN) has a weight (w1, w2, w3,…,wN). If a particular weight is less than 1, it will weaken its input; if it’s greater than 1, it will amplify it. In a slightly more complex, but widely adopted, model of the perceptron, there is also a fixed input of 1 with its own weight b, called the bias, which shifts the threshold at which the unit activates.

The activation function

Also called a transfer function, the activation function determines the value of the perceptron’s output. The simplest form of activation function is a certain type of step function. It mimics the biological neuron firing upon reaching its firing threshold by outputting a 1 if the total input exceeds a given threshold quantity, and outputting a 0 otherwise. However, for a more realistic result, one needs to use a non-linear activation function. One of the most commonly used is the sigmoid function:

f(x) = 1 / (1 + e^(-x))

There are many variations on this basic formula in common use. However, all sigmoid functions adopt some form of S-curve when plotted on a graph. When the input is a large negative number, the output is close to zero; at an input of zero, the output is 0.5. As the input grows more positive, the output keeps increasing but eventually maxes out at a fixed value represented by a horizontal asymptote. This maximum output value mirrors the maximum electrical impulse strength that a biological neuron can generate.
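
Putting the inputs, weights, bias, and sigmoid together, a single perceptron-style unit can be sketched in a few lines of plain Java. All the numbers here are made up for illustration.

public class Perceptron {
    static double sigmoid(double x) { return 1.0 / (1.0 + Math.exp(-x)); }

    static double activate(double[] inputs, double[] weights, double bias) {
        double sum = bias;                         // the fixed "input of 1" times b
        for (int i = 0; i < inputs.length; i++)
            sum += weights[i] * inputs[i];         // w1*x1 + w2*x2 + ...
        return sigmoid(sum);                       // squashed to a value between 0 and 1
    }

    public static void main(String[] args) {
        double[] x = {0.5, -1.2, 3.0};
        double[] w = {0.8, 0.2, -0.5};
        System.out.println(activate(x, w, 0.1));   // prints a value in (0, 1)
    }
}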

Adding Hidden Layers

In more complex, realistic neural models, there are at least three layers of units: an input layer, an output layer, and one or more hidden layers. The input layer receives the raw data from the external world that the system is trying to interpret, understand, perceive, learn, remember, recognize, translate, and so on. The output layer, by contrast, transmits the network’s final, processed response. The hidden layers that reside between the input and output layers, however, are the key to the machine learning that drives the most advanced artificial neural networks.

Most modeling assumes that the respective layers are fully connected. In a fully connected neural network, all the units in one layer are connected to all the units in their neighboring layers.

 

Backpropagation

You can think of backpropagation as the process that allows a neural network to learn. During backpropagation the network is in a continual cycle of training, learning, adjusting, and fine-tuning itself until its output gets closer to the intended output. Backpropagation optimizes by comparing the intended output to the actual output using a loss function. The result is an error value, or cost, which backpropagation uses to re-calibrate the network’s weights between neurons (to find the most relevant features and inputs that produce the desired output), usually with the help of the well-known gradient descent optimization algorithm. If there are hidden layers, the algorithm re-calibrates the weights of all the hidden connections as well. After each round of re-calibration, the system runs again. As error rates get smaller, each round of re-calibration becomes more refined. This process may need to repeat thousands of times before the output of the backpropagation network closely matches the intended output. At that point, one can say that the network is fully trained.
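
To give a flavor of what one round of that re-calibration might look like, here is a heavily simplified sketch for a single sigmoid unit with a squared-error loss. Real backpropagation repeats this chain-rule step for every weight in every layer; this is only the one-unit special case, with names of my own choosing.

public class TrainingStep {
    static double sigmoid(double x) { return 1.0 / (1.0 + Math.exp(-x)); }

    static void update(double[] weights, double[] inputs, double target, double learningRate) {
        double sum = 0.0;
        for (int i = 0; i < weights.length; i++) sum += weights[i] * inputs[i];
        double output = sigmoid(sum);

        double error = output - target;                   // how far off we are (the "cost" signal)
        double gradient = error * output * (1 - output);  // chain rule through the sigmoid

        for (int i = 0; i < weights.length; i++)
            weights[i] -= learningRate * gradient * inputs[i];  // nudge each weight downhill
    }
}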

Thinking It Through

With AI investment and development reaching new heights, this is an exciting time for AI enthusiasts and aspiring developers. However, it’s important to first take a good look at the IT behind AI: artificial neural networks (ANN). These computational models mimic the essential computational features found in biological neural networks. Neurons become perceptrons or simply units; dendrites become inputs; axons become outputs; electrical impulse strengths become connection weights; the neuron’s firing strength function becomes the unit’s activation function; and layers of neurons become layers of fully-connected units. Putting it all together, you can run your fully-trained feed-forward network as-is, or you can train and optimize your backpropagation network to reach a desired value. Soon you’ll be well on your way to your first image recognizer, natural language processor, or whatever new AI app you dream up.

Thanks for reading, and if you liked this please share this post or subscribe to my blog at JasonRoell.com or follow me on LinkedIn where I post about technology topics that I think are interesting for the general programmer or even technology enthusiast to know.

Have a great day and keep on learning!

The AI Winter is Over. Here’s Why. — February 16, 2017

The AI Winter is Over. Here’s Why.

Unless you’re living under a rock, you’ve probably noticed Artificial Intelligence (AI) is popping up more and more in technology talks and business strategies. I’ve even noticed among my friends an increased interest in “cognifying” their applications.

It’s easy to see why. Everyone is aware of the autonomous car revolution, and, if you are in the loop, you know it’s due largely to the advancements in AI, particularly “Machine Learning,” which is a strategy used to implement AI in software applications or robots running software. More on this later.

Let’s first step back and discuss AI. What does it mean for a machine to possess artificial intelligence?

At its core, it’s a simple idea. Artificial Intelligence is the broader concept of machines carrying out tasks in a way that we consider “smart.”

And,

Machine Learning is a current application of AI based upon the idea that we should give machines access to data and let them learn for themselves.

Many have heard the term “AI” before and even have some experience with AI applications going as far back as the ’90s and 2000s. Most people’s familiarity with AI is thanks to gaming: it is common to play against an AI in a video game when you don’t have another person to play with. Other AI applications users are familiar with include tools like spell checkers and other helpful systems that seem partially smart in helping humans complete a task using well-defined rules. However, many of these “older AIs” were developed and implemented as what we call an “expert system”. This means that to codify the intelligence into the program or software, we would need an expert, such as a linguistics expert when talking about spell check or a medical expert when talking about systems that help doctors diagnose patients. This type of AI system was very popular in the ’80s and ’90s.

Unfortunately, there was a problem with these types of AI expert systems. As anyone who has used an AI that was implemented with an expert system approach can attest, these systems constantly made mistakes when dealing with uncommon scenarios or in situations where even the expert was not well versed. The AI was useless in these situations, and fixing these systems called for reprogramming them with new expert information. Another drawback to expert systems is that they are very costly to build. It requires finding, for each particular domain, an expert who can articulate to programmers the intricacies of their field and when and why any given decision should be made. These types of decisions are hard to codify when writing a deterministic algorithm (a deterministic algorithm is an algorithm which, given a particular input, will always produce the same output, with the underlying machine always passing through the same sequence of states).

For these reasons, artificial intelligence researchers needed to invent a better way to give “smarts” to a machine.

This is where Machine Learning (ML) comes into play. Many people are surprised to learn that ML is actually a relatively old topic when it comes to AI research. Researchers understood back in the 80s that expert systems were never going to create an AI that would be capable of driving our cars or beating the best humans in chess or Jeopardy. That is because the parameters of problems such as these are too varied, change over time, and have many different weights based on different states of the applications. Moreover, there are many attributes to a problem that cannot be directly observed and thus cannot be directly programmed into the application logic.

Machine Learning addresses this problem by developing a program capable of learning and decision making based upon accessible data, similar to human cognition, instead of programming the machine to perform a deterministic task. So, instead of making a deterministic decision, the program relies on probability and a probability threshold to decide if it “knows” or “doesn’t know” something. This is how the human brain makes decisions in the complex world in which we live.

For example, if you see a flower that looks like a rose and you say, “Yep that is a rose,” what you are really saying is, “Yep, based on my previous knowledge (data) on what I believe a rose to look like and what I’ve been told a rose looks like, I am 97 percent sure that this is a rose.”

On the other hand, if you saw an iris flower, and you haven’t seen many before or don’t have many representations in your memory as to what an iris looks like, you might say, “I’m not sure what that flower is.” Your “confidence interval” is below the threshold of what you believe is acceptable for identifying the flower as an iris (let’s call that 30 percent sure that it was an iris flower.)
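
That thresholded decision is easy to sketch in code. Here is a toy version in plain Java using the numbers from the flower example; the labels, confidences, and threshold are all illustrative, not output from a real model.

public class ConfidenceCheck {
    static String decide(String label, double confidence, double threshold) {
        return confidence >= threshold
                ? "Yep, that is a " + label
                : "I'm not sure what that flower is";
    }

    public static void main(String[] args) {
        System.out.println(decide("rose", 0.97, 0.90));  // 97% sure -> call it a rose
        System.out.println(decide("iris", 0.30, 0.90));  // 30% sure -> withhold judgment
    }
}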

Making decisions based on probability is what our brains do best. If we can model a computer program in the same conceptual way, then the implications for AI are potentially unlimited!

Okay, so the question you should ask now is, “If we knew all this in the 70s and 80s then why is ML just now becoming popular?”

My answer to that is analogous to the scientific community only recently verifying the existence of the gravitational waves that Einstein proposed in theoretical physics so many years ago. We just didn’t have the machinery or tools to validate or invalidate his theory; it was ahead of its time, and its practical use had to wait.

Even though we’ve understood the mathematical model around machine learning for many years, the infrastructure, technology, and data needed to make machine learning a reality (as opposed to a theory) were not available in the 70s and 80s. However, that is no longer the case, and the “AI Winter” may soon be over.

Three basic “raw materials” are necessary to create a practical machine learning application. Let’s talk about them.

1. Cheap Parallel Computation

Thinking is not a synchronous process. At its core, thinking is just pattern recognition. We are constantly seeing, feeling, hearing, tasting, or smelling patterns that activate different clusters of neurons. Millions of these “pattern-recognizing neurons” pass low-level patterns up to the next layer of neurons. This process continues until we reach the highest conceptual layers and the most abstract patterns.

Each neuron can be thought of as an individual pattern recognition unit, and based on the input it gets from other neurons (recognized patterns) we can eventually make high-level decisions. We would like to model computer programs in a similar way. Modeling computer programs in a way that resembles the biological structure of the brain is called an “artificial neural network.” How fitting!

Parallelization of this type of data processing is obviously vital to building a system that simulates human thought. That type of power was not available in the 80’s and just recently became cheap enough for it to be practical in machine learning solutions.

Why now?

One word. GPUs.

Okay, maybe that’s three words? Graphics Processing Units (GPUs) became popular because of their ability to render demanding graphics for video games, consoles, and eventually even cell phones. Graphics processing is inherently parallel, and GPUs were architected to take advantage of this type of computing. As GPUs became popular, they also became cheaper, because companies competed against each other to drive prices down. It didn’t take long to realize that GPUs might solve the computation problem that had been stumping researchers for decades, by providing the parallel computation required to build an artificial neural network. They were right, and the falling cost of GPUs enabled companies to buy massive numbers of them and use them to build machine learning platforms. This has greatly accelerated what we are able to build with highly parallelized neural networks and the amount of data we are able to process. Speaking of data…

2. Big Data

Big Data this, Big Data that…

I know everyone has heard all of the hype about big data. What does it really mean, though? Why is it here now? Where was it before? How big is “big”??

Well, the truth of the matter is that when we talk about big data, we’re really saying that we’re capturing, processing, and generating more data every year and this data is growing exponentially.

The reason this is a big deal is that in order to train an artificial brain to learn for itself, you need a MASSIVE amount of data. The amount of visual data alone that a baby takes in and processes each year is more data than data centers had in the 80s. That’s not even enough to train machines, though. That’s because we don’t want to wait a year to learn elementary vision! To train computers in artificial vision, we need more data than some people absorb in a lifetime. That has finally become possible because storing and recording data is now cheap and fast, and data is generated by everything, everywhere!

A single smartphone today holds more data than most giant computer systems did in the 80s. Data, and the memory to store it, have grown to epic proportions, with no indication of slowing down anytime soon. This data is crucial for the implementation of smart machines, as it takes many instances of a problem to infer a probabilistically correct solution. Big data is the knowledge base from which these computers learn. All the knowledge an AI has access to is the result of us collecting and feeding it more and more data and letting the machine learn from the underlying patterns in that data.

The power of AI comes when computers start recognizing patterns that human practitioners have never noticed. The machine comes to understand the data and recognize patterns in it the same way our neurons learn to recognize patterns in problems we have seen before. The advantage the machine has over us is electronic signaling through circuitry, which is much faster than the chemical signaling across the synapses in our brains. Without big data, our machines would have nothing to learn from. The larger the data set, the smarter the AI will become and the quicker the machine will learn!

3. Better/Deep Algorithms

As I alluded to before, researchers invented artificial neural nets in the 1950s, but the problem was that even when they had the computing power, they still didn’t have efficient algorithms to process these neural nets. There were just too many astronomically huge combinatorial relationships between a million — or a hundred million — neurons. Recently, that has all changed. Breakthroughs in the algorithms involved have led to a new type of artificial network: layered networks.

For example, take the relatively simple task of recognizing that a face is a face. When a group of bits in a neural net is found to trigger a pattern—the image of an eye, for instance—that result (“It’s an eye!”) is moved up to another level in the neural net for further parsing. The next level might group two eyes together and pass that meaningful chunk on to another level of hierarchical structure that associates it with the pattern of a nose. It can take many millions of these nodes (each one producing a calculation feeding others around it), stacked up many levels high, to recognize a human face. In 2006, Geoff Hinton, then at the University of Toronto, made a key tweak to this method, which he dubbed “deep learning.” He was able to mathematically optimize results from each layer so that the learning accumulated faster as it proceeded up the stack of layers. Deep learning algorithms accelerated enormously a few years later when they were ported to GPUs. The code of deep learning alone is insufficient to generate complex logical thinking, but it is an essential component of all current AIs, including IBM’s Watson, Google’s DeepMind and search engine, and Facebook’s algorithms.

This perfect storm of cheap parallel computation, bigger data, and deeper algorithms generated the 60-years-in-the-making overnight success of AI. And this convergence suggests that as long as these technological trends continue—and there’s no reason to think they won’t—AI will keep improving.

Thanks for reading, and if you liked this please subscribe to my blog at JasonRoell.com or follow me on LinkedIn where I post about technology topics that I think are interesting for the general programmer or even technology enthusiast to know.

Also, I would like to thank Kevin Kelly for my inspiration for this post. I highly recommend picking up his book “The Inevitable” in which he discusses these and many more processes in much more detail.

Sources:

The Inevitable: Understanding the 12 Technological Forces that will Shape our Future. -Kevin Kelly 2016

How to Create a Mind – Ray Kurzweil

http://www.forbes.com/sites/bernardmarr/2016/12/06/what-is-the-difference-between-artificial-intelligence-and-machine-learning/#7ed065c687cb

RxJava: A Paradigm Shift — October 21, 2016

RxJava: A Paradigm Shift

Grokking RxJava

Disclaimer: RxJava is a beast. A beautiful beast, but a beast nonetheless. There is a lot to it and I can’t cover it all in this post. I tried to use the Pareto Principle to cover the 20% that will give you 80% of what you really need to know. With that being said, let’s jump in!

RxJava can be complicated to wrap your head around at first (at least it was for me!). The main reason it can be tough is that most Java programs are written, and Java programming is usually taught, imperatively. As I’m sure you are aware, there are many different programming paradigms, and two of the most popular are imperative programming and functional programming. It is important to understand this distinction first, because to use RxJava effectively you will have to understand the difference between these paradigms.

Functional vs. Imperative: Quick Overview

The functional programming paradigm was explicitly created to support a pure functional approach to problem solving. Functional programming is a form of declarative programming. In contrast, most mainstream languages, including object-oriented programming (OOP) languages such as C#, C++, and Java, were designed to primarily support imperative (procedural) programming.

With an imperative approach, a developer writes code that describes in exacting detail the steps that the computer must take to accomplish the goal. This is sometimes referred to as algorithmic programming. In contrast, a functional approach involves composing the problem as a set of functions to be executed. You define carefully the input to each function, and what each function returns. The following table describes some of the general differences between these two approaches.

Characteristic | Imperative approach | Functional approach
Programmer focus | How to perform tasks (algorithms) and how to track changes in state. | What information is desired and what transformations are required.
State changes | Important. | Non-existent.
Order of execution | Important. | Low importance.
Primary flow control | Loops, conditionals, and function (method) calls. | Function calls, including recursion.
Primary manipulation unit | Instances of structures or classes. | Functions as first-class objects and data collections.

source: MSDN
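
To make the contrast concrete, here is the same small task written both ways in plain Java. The data and the "longer than three characters" rule are made up for illustration; the functional version uses Java 8 streams rather than RxJava, but the mindset is the same.

import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class ParadigmComparison {
    public static void main(String[] args) {
        List<String> names = Arrays.asList("Ann", "Brian", "Charlotte");

        // Imperative: how to do it, step by step, with mutable state.
        List<String> resultImperative = new ArrayList<>();
        for (String name : names) {
            if (name.length() > 3) {
                resultImperative.add(name.toUpperCase());
            }
        }

        // Functional: what we want, declared as a pipeline of transformations.
        List<String> resultFunctional = names.stream()
                .filter(name -> name.length() > 3)
                .map(String::toUpperCase)
                .collect(Collectors.toList());

        System.out.println(resultImperative);  // [BRIAN, CHARLOTTE]
        System.out.println(resultFunctional);  // [BRIAN, CHARLOTTE]
    }
}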

Although most languages were designed to support a specific programming paradigm, many general-purpose languages are flexible enough to support multiple paradigms. For example, most languages that contain function pointers can be used to credibly support functional programming. RxJava brings this style to Java: it is a form of declarative, functional programming. Some even describe it as Functional Reactive Programming (oh great, another paradigm!). Well, don’t worry, it’s very similar to functional programming, but it also adds in “reactive” behavior. See the image below:

Functional Reactive Programming Basics

[Image: functional reactive programming basics]

Fundamentally, Functional Reactive Programming (FRP) is the Observer Pattern (this is the “reactive” part), with support for manipulating and transforming the stream of data our Observables emit, and doing this without side effects (this is the functional part). Observables are a pipeline (think of a stream) that our data will flow through.

Side note: a side effect refers simply to the modification of some kind of state – for instance:

  • Changing the value of a variable;
  • Writing some data to disk;
  • Enabling or disabling a button in the User Interface.

If you aren’t familiar with the Observer Pattern and don’t feel like reading the wiki on it, there are two main ideas that you need to understand. The Observer Pattern involves two roles: a source Observable, and one or many Observers that are interested in the events or objects that the source Observable will be emitting. The Observable emits objects, while the Observer subscribes and receives them.

Okay, now that you are an expert on FRP, I can start talking a little bit about RxJava. The simplest way to think about RxJava is that it introduces an easy library for developing functional reactive applications (great for micro-service architecture!) in the Java language. So why do we need that? Because non-blocking architecture is the best way to handle scalable applications.

FRP allows us to develop applications in this way. Going back to FRP, note that a functional reactive system gives you two main components to work with. These components are (thankfully) named after the roles in the Observer Pattern that they implement: Observable and Observer. You will also see people (and docs) talking about a Subscriber. Don’t let this confuse you: a Subscriber is just an implementation of Observer, with additional semantics around subscription (it’s mostly about un-subscription). Typically the two terms are interchangeable. In both cases, when we talk about Observers and Subscribers, we are talking about the objects interested in receiving items from the source Observable they are subscribed to. From here on out I will refer to them as Subscribers (slightly enhanced Observers, and what you will typically be working with anyway) to keep things clear when I am talking about an Observable (Observable and Observer are too close to the same word! It hurts my mind).

TL;DR: The Observable emits items; the Subscriber consumes those items as they are emitted. So instead of having a collection and pulling items out, we have a publisher (the Observable) pushing items to us.


So how does an Observable emit (send) items to its Subscribers? There is a method to this madness. An Observable may emit any number of items (including zero items or infinitely many), and then it terminates, either by successfully completing or due to an error. For each Subscriber it has, an Observable calls Subscriber.onNext(T item) any number of times, passing the Subscriber the item it is interested in, followed by either Subscriber.onCompleted() or Subscriber.onError(Throwable e). These are the only two ways you know that an Observable is “finished”. Note that the Subscriber.onCompleted() call will only happen if the Observable is not an infinite stream. Think of an Observable that wraps a timer, emitting the time every “x” milliseconds to a log.

I know what you are thinking…WE NEED EXAMPLES! Okay, okay. Well, let’s start out with the basics of just creating an Observable. There are a few ways to create an Observable depending on the source you are converting into an Observable stream. However, all of them rely on static factory methods on the Observable class. So let’s say I have stringOne, stringTwo, and stringThree and I want to convert these into an Observable that can later be subscribed to; all I would need to do is the following:

Observable<String> interestingStrings = Observable.just("I am interesting", "Subscribers will love me", "I hope Subscribers like me");

Wow, how easy was that?!

Okay, now we have an Observable that will emit 3 strings to any Subscriber that is interested. Not until a Subscriber subscribes will anything actually happen or get emitted. That makes sense if you think about it: if I had a magazine but no subscribers, I wouldn’t do any work to send my magazines anywhere. There would be no one interested!

So let’s create a subscriber and subscribe to our new shiny Observable (interestingStrings).

Observable<String> interestingStrings = Observable.just("I am interesting", "Subscribers will love me", "I hope Subscribers like me"); // created Observable
// create awesome Subscriber
Subscriber<String> iPrintLines = new Subscriber<String>() {
    @Override
    public void onNext(String s) { System.out.println(s); }
    @Override
    public void onCompleted() { }
    @Override
    public void onError(Throwable e) { }
};
interestingStrings.subscribe(iPrintLines); // Yay, someone subscribed!
 
// this will print out:
//
// I am interesting
// Subscribers will love me
// I hope Subscribers like me
//
// (not printed) onCompleted() will then be called, but since we don't have any behavior for this function, nothing will be shown.
// onError() was not called because none of our items threw an exception. But if they did, the exception would have been passed to
// the onError() handler in the subscriber (which is in charge of deciding what to do with it) and the Observable would stop emitting any items after that.

An important thing to remember is that as soon as you have a subscriber, the Observable will begin emitting its source items (if it has any yet) to the subscriber by calling Subscriber.onNext(String s), in our case passing us those 3 strings. The subscriber then operates on each emitted item in sequence, performing the work you have declared in your onNext function. So in our case, we take each string and print it to the console.

That was a bit verbose because I wanted you to get a good idea of how the Observable and Subscriber are actually constructed and what is actually going on. A simpler and more common approach to working with the items you are interested in might look like this. In this example, the subscriber is implicitly created and you are just working with the onNext, onError, and onCompleted callbacks.

Observable<String> interestingStrings = Observable.just("I am interesting", "Subscribers will love me", "I hope Subscribers like me"); // creation
 
interestingStrings.subscribe(s -> System.out.println(s)); // inline subscriber that only implements the onNext() handler
 
interestingStrings.subscribe(s -> System.out.println(s),
                             e -> System.out.println(e.getMessage())); // inline subscriber that implements the onNext() and onError() handlers
 
interestingStrings.subscribe(s -> System.out.println(s),
                             e -> System.out.println(e.getMessage()),
                             () -> System.out.println("I'm Done!")); // inline subscriber that implements all three handlers

Okay, so we’ve worked a little bit with creating an Observable and creating some Subscribers to do something with our emitted items. However, the Observable.just() creation factory method is just one way to create an Observable from one or more items. Another factory method you will most likely want to familiarize yourself with is Observable.from(). This method takes a list of items and converts it from a List into an Observable. Awesome!

List<String> myStringList = Arrays.asList("Hello", "What's up", "Nothing Much");
 
Observable<String> stringsFromList = Observable.from(myStringList);
 
// from here you can work with the Observable the same way as the above examples

We’ve now seen a couple of ways to create and subscribe to Observables. Next, I want to get into what makes RxJava really awesome: its composable, functional API, which exposes a lot of great higher-order functions to filter, reduce, map, aggregate, etc., that you can apply to your Observable stream. If we were just interested in reactive programming, then we could achieve something pretty close to what we have done so far just by using callbacks or futures.

Always keep in mind what it is that we are trying to achieve and how RxJava makes this easier. We are trying to build systems now that rely on asynchronous micro-services.

Our client code will be calling one to many micro-services asynchronously to compose data for use (remembering that this means we may not know when and which services will respond first). So how can we build a system this way? Well, up until now in Java, we’ve only had a few options. We could use callbacks (a piece of executable code that is passed as an argument to other code, which is expected to call back (execute) the argument at some time in the future), or we could use Java Futures. Callbacks (if you’ve ever worked with them, you know) can quickly become unmanageable once a few callbacks themselves have callbacks. You get into what is known as “Callback Hell,” which makes your code unreadable, hard to debug, and even harder to compose effectively. The image below shows what callback hell might look like…

[Image: nested callbacks forming "callback hell"]

So how about our other option? Java Futures? Let’s compare them to RxJava and see how they are different.

Futures

Futures were introduced in Java 5. They are objects that promise to hold the result of something that (maybe) hasn’t occurred yet. They are created, for example, when a task (i.e., a Runnable or Callable) is submitted to an executor. The caller can use the future to check whether the task isDone(), or wait for it to finish using get(). Example:

import java.util.concurrent.*;

/**
* A task that sleeps for a second, then returns 1
**/
public static class MyCallable implements Callable<Integer> {
    @Override
    public Integer call() throws Exception {
        Thread.sleep(1000);
        return 1;
    }
}
public static void main(String[] args) throws Exception{
    ExecutorService exec = Executors.newSingleThreadExecutor();
    Future<Integer> f = exec.submit(new MyCallable());
    System.out.println(f.isDone()); //False
    System.out.println(f.get()); //Waits until the task is done(blocking!), then prints 1
}

CompletableFutures

CompletableFutures were introduced in Java 8, and are in fact an evolution of regular Futures inspired by Google’s Listenable Futures. They are Futures that also allow you to string tasks together in a chain. You can use them to tell some worker thread to “go do some task X, and when you’re done, go do this other thing using the result of X”. Here’s a simple example:

import java.util.concurrent.*;
import java.util.function.Function;
import java.util.function.Supplier;

/**
* A supplier that sleeps for a second, and then returns one
**/
public static class MySupplier implements Supplier<Integer> {
    @Override
    public Integer get() {
        try {
            Thread.sleep(1000);
        } catch (InterruptedException e) {
        }
        return 1;
    }
}
/**
* A function that adds one to a given Integer
**/
public static class PlusOne implements Function<Integer, Integer> {
    @Override
    public Integer apply(Integer x) {
        return x + 1;
    }
}
public static void main(String[] args) throws Exception {
    ExecutorService exec = Executors.newSingleThreadExecutor();
    CompletableFuture<Integer> f = CompletableFuture.supplyAsync(new MySupplier(), exec);
    System.out.println(f.isDone()); // False
    CompletableFuture<Integer> f2 = f.thenApply(new PlusOne());
    System.out.println(f2.get()); // Waits until the "calculation" is done, then prints 2
}

RxJava

RxJava, as we’ve seen above, is a whole library for reactive programming. Its main advantage over futures is that RxJava works on streams of zero or more items including never-ending streams with an infinite number of items. It can do so asynchronously and with little or no thread blocking. Futures, on the other hand, are single use. Each holds a single “future” result and that’s it. RxJava has a very rich collection of operators and is extremely flexible.

The Observable and Subscriber are independent of the transformational steps in between them.

I can stick as many map() calls as I want in between the original source Observable and its ultimate Subscriber. The system is highly composable: it is easy to manipulate the data. It’s all about the operators in RxJava that give you so much power.


So here is a code example that does a ton of work in such a concise, simple, and readable way. Try to achieve this using Futures or callbacks!

// this is a more advanced query that will query the DB, get us back URLs (endpoints to other async services),
// get the title from each, filter out the ones where the title is null, take the first 5 we get, and perform a save to
// another endpoint, all in one composed stream. Pretty amazing. And this is still just scratching the surface of what
// you can do once you learn all of the operators.
 
query("SELECT url FROM EnpointsTable")
    .flatMap(urls -> Observable.from(urls)) 
    .flatMap(url -> getTitle(url))
    .filter(title -> title != null)
    .take(5)
    .doOnNext(title -> saveTitle(title))
    .subscribe(title -> System.out.println(title)); // Remember that all of this is deferred until there is a subscriber

So the next thing you are going to want to do in your “Learning RxJava Journey” is to start looking at all the operators that you have access to. I won’t lie. Some of them can be complicated to understand at first and that is why the great people behind Reactive Programming have come up with a great visual aid to help you understand what each operator is capable of doing. These visual guides are known as Marble Diagrams and they go something like this:

[Image: a marble diagram]

To give you an idea of some of the most popular operators you will be using all the time, look at the marble diagrams for them below:

[Image: marble diagrams for common RxJava operators]

There is so much to learn in RxJava but I hope this quick introduction gets you excited about this technology and makes your life easier once you begin using it! Always keep in mind what RxJava is used for and what it is not. At the heart of it all, RxJava is a library for composing asynchronous and event-based programs by using observable sequences. That’s all I have for you today.

I hope you enjoyed this post, and if you did, please share it! That is the biggest compliment I can get as a blogger. Also, if you want to be notified about any future posts of mine, please subscribe to my blog at jasonroell.com. Have a great day!

Sources:

youtube.com/watch?v=Dk8cR1Kxj0Y, youtube.com/watch?v=QOR69q1e63Y, blog.danlew.net/2014/09/15/grokking-rxjava-part-1/, msdn.microsoft.com/en-us/library/mt693186.aspx

A Simple Introduction To Data Structures: Part One – Linked Lists — August 4, 2016

A Simple Introduction To Data Structures: Part One – Linked Lists

The world of programming is always changing. And changing FAST at that. We are constantly finding better ways to do what it is that we do. That is a great thing. Iteration is a very powerful concept.

However, there are a few ideas and constructs in the computer science world that remain constant. Data structures and their applications are some of those things.

You would think, then, that of all things this would be something every programmer or software engineer understands, right? Well, you would be wrong.

I can tell you that out of college I sure as heck didn’t understand when and why to use one data structure over another, and what I’ve found is that neither do many programmers nowadays who learn programming by doing (code bootcamps, online courses that get your hands dirty, building software out of their basement).

To be honest, I remember thinking that they really weren’t that important. I thought they were only needed in special cases, and maybe for code that was written for public frameworks or libraries.

Boy, was I wrong.

Understanding how to efficiently use data structures can easily separate a good developer from a bad one. However, really getting a firm grasp on them can be difficult for people who have a harder time with abstract concepts. Just try to read the de facto book on the subject cover to cover (“Introduction to Algorithms” – side note: I know it says “Algorithms,” but it really covers how data structures are built to lend themselves to certain algorithms).

In that book, and many others like it, you will find many mathematical proofs and ideas that seem very theoretical and the abstract nature of it all really makes it difficult to understand the practical use cases when actually developing software.

So what is a programmer to do if he didn’t graduate with a degree in mathematics?

When many developers first realize how important data structures are (after trying to write a system that processes millions of records in seconds) they are often presented with books or articles that were written for people with computer science degrees from Stanford.

I want this article to bridge the gap and explain data structures in a more practical way. What I want people to take away from this post is an understanding of why we have different data structures, what they are, and when to use a particular one.

This is going to be a simple introduction, so I will cover the data structures that you will use 95% of the time and leave the other 5% for you to discover on your own.

Let’s get to it then!

First, we need to define what exactly a data structure is. Well, a bunch of smart people have thrown around a lot of complex-sounding definitions, but the simplest and really the most accurate way to describe a data structure is to say that it is a particular way of organizing data in a computer so that it can be used efficiently. That is all it is. It is just a way to organize data, much like the way humans organize their bookshelves. You want to organize them in a way that makes it easy to get to what you want.

For example, continuing with the bookshelf analogy, if I wanted to be able to quickly pick out all of the books I own (let’s say hundreds) that start with the letter ‘T’ or ‘B’ or any other letter, then I would want to organize these books in a way that makes that task quick and easy to perform. In this example, it would mean organizing the books in alphabetical order. Simple enough.

However, if the way I was using the bookshelf was different (say I wanted to find all the books that pertained to the subject of physics) then quickly we can see that this organization of books will be problematic and cause me to be very inefficient in finding the books that I want.

The solution here would be to organize the books differently based on how we are going to retrieve them in the most common scenario. In the second scenario, we might have decided to organize the shelves according to the topic. The same goes for data structures and how you are going to typically interact with them.

So let’s start talking about the different ways we could organize our data…AKA the types of common data structures. To kick off this fun topic, we will start with one of my favorite data structures known as…

The Linked List: The building block of more complex data structures

Linked lists are among the simplest and most common data structures. They are also a great place to start because the linked list is a data structure that many other data structures use internally in their own structure.

Understanding linked lists will help your understanding of “pointers” (a value – usually written in hexadecimal – that represents the memory address of another value or start of a sequence of values) and how memory is stored and accessed by the system.

To make things easier, I’ve included a visual representation of what a linked list looks like conceptually.

Linear Linked List (also known as a One-Way List or Singly Linked List)

Let’s walk through the picture above and then we will go through why storing data in this way might be a good idea in some cases and a bad idea in others.

First, you can see where the image says “START”. This is what is known as the ‘head’ of the linked list. It is just the starting point that says where the first ‘node’ is located in memory (address 1000 in this case). The ‘1000’ is a pointer to the location of that first node.

I know what you are thinking: “What the hell is a node??”. At least that is what I was thinking when I first read about linked lists. Well, to put it simply, a ‘node’ is just an object that has two fields. One field is the ‘Information’ field (where you would store some data that you are concerned about) and the other is a Pointer field. All the pointer field holds is the memory address of the next node. That’s it! It’s actually so ridiculously simple that many people overthink it.

As you can see in the picture above, after we go to memory address 1000 we encounter a node. In this node, there are two fields we can see. In our case, the first field, which holds the ‘information’, is storing the character ‘A’. The second field (the Pointer field) is storing the location in memory of the next node (memory location 2000).

Next, we follow the pointer to memory location 2000. Once we arrive there, we encounter our second node. As you can see, this second node has ‘B’ as its information and ‘3000’ as its Pointer value. Nothing new here!

Finally, we follow the Pointer value to the memory location (3000) and come to yet another node. This node is almost the same as the previous two. It has a value of ‘C’ for its information field, but as you can see, it has nothing (or null) for its Pointer field. This just means that we have come to the end of our list. There are no more pointers to follow! This node (being the last node) is known as the ‘Tail Node’.
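
To make the walkthrough concrete, here is a minimal sketch of that exact list in Java (the Node class, its field names, and the main method are just illustrative choices, not part of the picture above): each node holds one piece of information and a reference to the next node, and traversal simply follows those references until it hits null.

```java
// A minimal singly linked list node: one information field, one pointer field.
public class Node {
    char info;   // the 'Information' field (e.g. 'A', 'B', 'C')
    Node next;   // the 'Pointer' field: reference to the next node, or null for the tail

    Node(char info) {
        this.info = info;
    }

    public static void main(String[] args) {
        // Build the list from the example: A -> B -> C
        Node head = new Node('A');       // the 'START' of the list
        head.next = new Node('B');
        head.next.next = new Node('C');  // tail node: its 'next' stays null

        // Traverse: follow the pointers until we reach null (the end of the list).
        for (Node current = head; current != null; current = current.next) {
            System.out.println(current.info);
        }
    }
}
```

In a language with raw pointers (like C), the next field would literally hold a memory address such as 2000; in Java, an object reference plays the same role.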

So what makes a Linked List valuable as a data structure and when and why might we choose to use one? Let’s cover that next.

The structure of a linked list gives it properties that make it different from other collection structures (such as an array, which I will cover in a later post). Because it uses pointers instead of contiguous memory blocks, a linked list is great for the following (a small sketch of a constant-time head insertion follows this list):

  1. When you need constant-time insertions/deletions from the list (such as in real-time computing where time predictability is absolutely critical)
  2. When you don’t know how many items will be in the list. With arrays, you may need to re-declare and copy memory if the array grows too big (once again, I will go into more detail about arrays in a later post.)
  3. When you don’t need random access to any elements
  4. When you want to be able to insert items in the middle of the list (such as a priority queue – another data structure I will cover down the road)
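
Here is the constant-time insertion idea from point 1 as a small sketch (again in Java, with made-up names): adding a new node at the head allocates one node and rewires one pointer, no matter how long the list already is.

```java
// Illustrative sketch: inserting at the head of a singly linked list is O(1),
// because no existing elements have to be shifted or copied.
public class LinkedListInsert {
    static class Node {
        int value;
        Node next;
        Node(int value, Node next) { this.value = value; this.next = next; }
    }

    // Returns the new head; the existing list is untouched apart from being linked to.
    static Node insertAtHead(Node head, int value) {
        return new Node(value, head);   // one allocation, one pointer assignment
    }

    public static void main(String[] args) {
        Node head = null;                     // start with an empty list
        for (int i = 1; i <= 3; i++) {
            head = insertAtHead(head, i);     // grows without any re-allocation or copying
        }
        for (Node n = head; n != null; n = n.next) {
            System.out.print(n.value + " ");  // prints: 3 2 1
        }
    }
}
```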

However, there are times when using a Linked List would be very inefficient and you would be better off with another collection data structure like an Array.

Don’t use a Linked List in these scenarios:

  1. When you need indexed/random access to elements
  2. When you know the number of elements in the array ahead of time so that you can allocate the correct amount of memory for the array
  3. When you need speed when iterating through all the elements in sequence. You can use pointer math on an array to access each element, whereas in a linked list you have to look up each node by following the previous node’s pointer, which may cause page faults and, in turn, performance hits.
  4. When memory is a concern. Filled arrays take up less memory than linked lists. Each element in the array is just the data. Each linked list node requires the data as well as one (or more) pointers to the other elements in the linked list.

As you can see, depending on what you are trying to do, a linked list may be a great data structure or a very inefficient one. This all goes back to understanding the pros and cons of each data structure and choosing the right ‘tool’ for the job.

Hopefully, this was a quick and simple introduction to why data structures are important to learn, and it shed some light on when and why linked lists are an important starting point.

Next up, I will be covering Arrays, Stacks, and Queues. Stay tuned!

If you can think of any better ways of explaining Linked Lists or why data structures are important to understand, leave them in the comments!

If you liked this post, please share it with others! That is the biggest compliment I could receive. Also, please subscribe to my blog (jasonroell.com) if you are a technology enthusiast!  Have a great day and learn on!

MOCKS: What are they? When should you use them? — April 19, 2016

MOCKS: What are they? When should you use them?

A couple weeks ago I wrote a post about unit testing and its importance in developing quality software. You can find that here. A topic that is seen a lot when discussing unit testing is the idea of “mocking” objects (I’ll explain this in a second).

I didn’t go into this in my unit testing post because I wanted to keep it simple and convey the idea that unit testing can be a valuable tool if used correctly. I didn’t want to delve too much into the complexities of some of the actual implementation details of good unit tests.

However, last week I received a question via email (something I encourage all of my readers to send!) from a reader who said he understood the value of unit testing but was still a little confused about the benefits of using “mocks”. This gave me the idea for this post!

So let’s first rewind a little and define what “mocking” objects really means. Mocking is “faking” an object. If you look up the noun mock in the dictionary, you will find that one of the definitions of the word is “something made as an imitation”. Sometimes we want to swap out some of our application’s dependencies with “fake” objects (aka mocks). I’ll go into more detail about why in a second, but let’s first get something else straight.

Mocking goes hand in hand with the concepts of unit testing and Separation of Concerns (SoC). We already talked a little bit about unit testing, but the idea of SoC is a little different. Let me go into a little more detail, while still taking a pretty high-level approach.

Formally, SoC is a design principle for separating a computer program into distinct sections, such that each section addresses a separate concern.

Concerns are the different aspects of software functionality. For instance, the “business logic” of software is a concern, and the interface or view through which a person uses this logic is another.

The separation of concerns means keeping the code for each of these concerns separate. Changing the interface/view should not require changing the business logic code, and vice versa. The Model-View-Controller (MVC) design pattern is an excellent example of separating these concerns for better software maintainability.
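
As a rough sketch of what that separation can look like in code (the CustomerService and CustomerRepository names here are hypothetical, chosen to match the example later in this post), the business logic depends only on an interface, so the persistence concern can change, or be faked, without touching the business rules:

```java
// Hypothetical sketch of separating concerns: the business logic depends only on an
// interface, not on how customers are actually stored or displayed.
interface CustomerRepository {                  // persistence concern
    String findNameById(int id);
}

class CustomerService {                         // business-logic concern
    private final CustomerRepository repository;

    CustomerService(CustomerRepository repository) {
        this.repository = repository;
    }

    String greeting(int customerId) {
        // Business rule: build a greeting; knows nothing about databases or views.
        return "Hello, " + repository.findNameById(customerId) + "!";
    }
}
```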

So now that you are an expert in SoC, how do SoC and unit testing go together? Well, there are a few principles that all good unit tests follow, and those are that they are:

  • Automatic: Invoking the tests (as well as checking the results for PASS/FAIL) should be automatic.
  • Thorough: Coverage matters; although bugs tend to cluster around certain regions of the code, ensure that you test all key paths and scenarios.
  • Repeatable: Tests should produce the same results each time…every time. Tests should not rely on uncontrollable parameters.
  • Independent: Very important.
    • Tests should test only one thing at a time. Multiple assertions are okay as long as they are all testing one feature/behavior. When a test fails, it should pinpoint the location of the problem.
    • Tests should not rely on each other – they should be isolated. Make no assumptions about the order of test execution, and ensure a ‘clean slate’ before each test by using setup/teardown appropriately (see the sketch after this list).
  • Professional: In the long run you’ll have as much test code as production code (if not more), so follow the same standard of good design for your test code: well-factored methods and classes with intention-revealing names, no duplication, tests with good names, etc.
  • Fast: Good tests also run fast. Any test that takes over half a second to run needs to be worked on. The longer the test suite takes to run, the less frequently it will be run, and the more changes a developer will try to sneak in between runs; if anything breaks, it will take longer to figure out which change was the culprit.
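
Here is a minimal sketch of the ‘clean slate’ idea, assuming JUnit 5; the list standing in for the object under test is just a placeholder:

```java
import static org.junit.jupiter.api.Assertions.assertTrue;

import java.util.ArrayList;
import java.util.List;

import org.junit.jupiter.api.AfterEach;
import org.junit.jupiter.api.BeforeEach;
import org.junit.jupiter.api.Test;

// Sketch of the 'clean slate' idea: setUp runs before every test and tearDown after,
// so no test depends on state left behind by another one.
class CartTest {

    private List<String> cart;       // stand-in for whatever object your tests exercise

    @BeforeEach
    void setUp() {
        cart = new ArrayList<>();    // every test starts from a fresh, empty cart
    }

    @AfterEach
    void tearDown() {
        cart.clear();                // tidy up anything the test added
    }

    @Test
    void newCartIsEmpty() {
        assertTrue(cart.isEmpty());  // tests exactly one thing, independent of test order
    }

    @Test
    void addedItemIsStored() {
        cart.add("book");
        assertTrue(cart.contains("book"));   // still passes regardless of which test ran first
    }
}
```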

Each of these is very important, but two areas stand out to me as the most important of all — Repeatable and Independent. Mocks help you write unit tests that are repeatable, independent, and follow SoC. Let’s dive a little deeper.

An object under test may have dependencies on other (complex) objects, and relying on those real objects in a test is a direct violation of the principles above. To fix this, and to isolate the behavior of the object you want to test, you replace the other objects with mocks that simulate the behavior of the real objects. This is useful when the real objects are impractical to incorporate into the unit test.

In short, mocking is creating objects that simulate the behavior of real objects. The true purpose of mocking is to achieve real isolation.

Let’s go through an example. Say you have a class CustomerService, that depends on a CustomerRepository. You write a few unit tests covering the features provided by CustomerService. They all pass.

A month later, a few changes were made, and suddenly your unit tests start failing – and you need to find where the problem is.

A logical person would assume “Hey, my unit tests for class CustomerService are failing. The problem must be there!”

However, because you introduced a dependency (CustomerRepository), the problem could actually be there instead! If any of a class’s dependencies fail, chances are the class under test will appear to fail too.

Now picture a huge chain of dependencies: A depends on B, B depends on C, C depends on D. If a fault is introduced in D, all your unit tests will fail.

And that’s why you need to isolate the class under test from its dependencies (be it a domain object, a database connection, file resources, etc.). You want to test a unit.
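
Here is a minimal sketch of what that isolation looks like in a test, assuming JUnit 5 and Mockito are available and reusing the hypothetical CustomerService/CustomerRepository shapes sketched earlier: the repository is replaced by a mock, so a failure in this test points squarely at CustomerService.

```java
import static org.mockito.Mockito.*;
import static org.junit.jupiter.api.Assertions.assertEquals;
import org.junit.jupiter.api.Test;

// Sketch of isolating CustomerService from its CustomerRepository dependency with a mock.
class CustomerServiceTest {

    @Test
    void greetingUsesTheCustomersName() {
        // Arrange: mock the dependency so no real database or repository code runs.
        CustomerRepository repository = mock(CustomerRepository.class);
        when(repository.findNameById(42)).thenReturn("Ada");

        CustomerService service = new CustomerService(repository);

        // Act
        String result = service.greeting(42);

        // Assert: if this fails, the problem is in CustomerService, not in the repository.
        assertEquals("Hello, Ada!", result);
    }
}
```

If you also wanted to check that CustomerService actually called the repository, a line like verify(repository).findNameById(42) would do it; that is the interaction-based style of checking.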

Let’s look at some further obstacles that could be introduced without using mocks. Consider testing a large web application. Let’s assume that the unit tests use no mocks at all. What problems will it face?

Well, the execution of the test suite will probably be very slow: dozens of minutes, perhaps hours. Web servers, databases, and services over the network run thousands of times slower than computer instructions, which drags down the speed of the tests. Testing just one statement within a business rule may require many database queries and many web server round-trips.

The tests are sensitive to faults in parts of the system that are not related to what is being tested. For example: Network timings can be thrown off by unexpected computer load. Databases may contain extra or missing rows. Configuration files may have been modified. Memory might be consumed by some other process. The test suite may require special network connections that are down. The test suite may require a special execution platform, similar to the production system.

Now, before you go mocking everything, step back and decide what actually needs to be mocked and where you might just be adding complexity to your project and code. As always, your mileage may vary, and depending on what you are developing you may be able to have perfectly good unit tests with some of the classes having minor dependencies (especially if those dependencies rarely change).

My guideline for mocking when you are just getting started is to mock across architecturally significant boundaries (database, web server, and any external service), but not within those boundaries.

Of course, this is just a guideline and shouldn’t be applied to every project, but I believe it is a good starting point.

I hope this clears up what mocks are, why we use them, and how they can be beneficial. If you have any questions, please leave them in the comments!

If you liked this post, please share it with others! That is the biggest compliment I could receive. Also, please subscribe to my blog (jasonroell.com) if you are a technology enthusiast!  Have a great day and learn on!

What Makes A Great Teammate? — March 25, 2016

What Makes A Great Teammate?

Throughout my career (and life) I’ve been on many different teams. Obviously, this implies that I’ve been a teammate to others and had many teammates of my own.

I bring this up because yesterday I had an interaction with a current teammate of mine that made me think, “Wow, what a great person to work with.” This led me to wonder if anyone has ever said this about me. Then, I began to question what really is the hallmark of a great teammate: someone you look forward to working with every day, and someone you would be genuinely sad to see join another team.

As technical people it is so easy to get caught up in our work; problems are there to be solved and it is easy to shift your focus and neglect the world outside.  But becoming a great teammate can be one of the best ways to pave a great career path in the future.

So what makes an awesome teammate?

I think a good place to start when trying to answer this question would be to look back over the years at the people you know that everyone loved to work with. What were their qualities? Why did everyone like them so much? Why did you like them so much?

Next, after you have a handful of people that you have determined were great teammates, think about the other end of the spectrum. Who were the people on teams that everyone avoided? Why did others dislike working with them? What made interacting with them so unenjoyable?

After asking myself these questions, I made a few observations.

Be positive, and have a good outlook.

This is pretty simple and should probably be obvious. At its core, though, is having a good attitude and looking at each day and project as an opportunity, not just to get a lot done, but also to take steps toward being a better person. This means not just having a good attitude when you’re having a great day; it means that even in the face of very stressful situations – like when your website goes down, or the system stops behaving correctly during a launch event – you stay positive and calm and work together with your team to provide a solution. Getting pissed and yelling at everyone never solved any problems (but I’m sure it has created a few).

The people I recognized as good teammates were generally happy, positive people. They recognized the best in people and situations; they pulled their own weight and were enthusiastic to help when  needed.

On the other hand, people that weren’t as great to work with were mostly the opposite. They had a negative attitude about the project or team or even the day of the week it was. They weren’t enthusiastic about helping others or getting out of their comfort zone. Even on sunny days when everything was going right, they still seemed to have a problem with something or somebody. Don’t be this person! Nobody wants to listen to Debbie Downer all day long.

Bring solutions, not problems

This one hits home big for me. In the beginning of my career, I used to work with someone that always had a problem with the way something was done or an idea that was brought up. This person was definitely a pessimist and if you brought up an idea or solution to him he would meticulously pick it apart and tell you all the reasons it wouldn’t work. This was especially unfortunate because he was the team lead of our project. Eventually, people stopped sharing their ideas with him in fear that it would get shut down and they would be ridiculed for bringing up such an absurd solution.

This finally got to me, and one day I asked him what he would do to address his concerns. I asked if he would start thinking about the problems from all sides, and start coming up with ideas rather than shutting everyone else’s down. Eventually, he started coming up with a list of things that could be done to mitigate the problems. Just coming up with some potential solutions makes the conversations much more pleasant.

Next time you want to tear down someone else’s idea, remember that they at least came up with an idea to try to help solve the problem (you didn’t). Unless you have something that would work better as a solution, try to find a way to mold their idea into something that could work rather than picking it apart for all the reasons that it won’t.

Always take the blame, but never take the credit.

This has been called out in Jim Collins’ book “Good to Great”, and for good reason. This is one of the most important aspects of being a good teammate and a good leader.

Everyone wants to be liked. I know I do. If you understand this about people, it can really empower you to be a better person all around.

When you achieve success, generally it was through the help of many people. Making sure you recognize this fact and never slight anyone is very key. Make sure to use “We” instead of “I” when talking about any of your accomplishments. Even if you were the one that did 99% of the work to complete a project, praise Joe for the 1% that he completed! I can guarantee that it will make a world of difference to him, and if you don’t he might even hold a grudge against you for it. People want to be appreciated, even if it is for the tiniest of tasks. Let them know that you recognize the work that they have done and are glad to have them helping you.

On the flip side, if a project fails or a major bug gets released into production, always take the blame. There is something to be said about having full ownership with everything you do in life and understanding that there is no reason to blame anyone else. Take responsibility and learn from your mistakes, even if your mistake was hiring a developer that didn’t understand that he should encrypt credit card numbers in the application and sent them over the wire in plain text…(uh-oh!).

While it may be mostly someone else’s fault, standing up and owning your part prevents others from feeling slighted. Plus, people want to work with people who own their mistakes and failures. Of course, a key part of this is also addressing the cause, or mitigating it in the future. Regardless, don’t blame other people or point fingers; real leaders and great teammates take ownership and then strive to do better next time.

(For more on taking full ownership, read this great book written by retired Navy SEAL Jocko Willink –> http://www.amazon.com/Extreme-Ownership-U-S-Navy-SEALs-ebook/dp/B00VE4Y0Z2/ref=tmm_kin_swatch_0?_encoding=UTF8&qid=&sr=)

These are just some of the most important qualities that I believe make a great teammate. When you thought about the people you enjoyed working with the most, what qualities did you find? Let me know in the comments!

If you liked this post, please share it with others! That is the biggest compliment I could receive. Please subscribe if you are a technology enthusiast!

 

 

Unit Testing. Is it Worth It? — March 10, 2016

Unit Testing. Is it Worth It?

SPOILER ALERT: Yes.

The idea that code needs tests is such an old one that in his book “Working Effectively with Legacy Code” (written in 2004!) Michael Feathers defines “legacy code” as code that is not accompanied by unit tests.

This should be common knowledge in the software community by now, but I still run into developers who do not see the incredible importance of unit testing.

Like many folks, I have long admired parts of “Uncle Bob” Martin’s work. While I disagree with him, sometimes strongly, on some of his views, there is no question that a meme he has long pushed is absolutely spot on: Checking in code without accompanying tests is unacceptable.

I’m not here advocating Martin’s larger solution (test-driven development), but rather acknowledging the rectitude of his fundamental position. To be fair, Martin is not the only person, nor even the most prominent, to advocate the value of checking in code and tests at the same time. Kent Beck, Michael Feathers, and many of the exponents behind continuous integration and DevOps have long articulated this position. But Martin has tirelessly championed it and, in large part because of his efforts, most diligent developers today reflexively understand the importance of writing tests that immediately exercise their new code.

J. Timothy King has a nice piece on the twelve benefits of writing unit tests first. Unfortunately, he seriously undermines his message by ending with this:

However, if you are one of the [coders who won’t give up code-first], one of those curmudgeon coders who would rather be right than to design good software, well, you truly have my pity.

Extending your pity to anyone who doesn’t agree with you isn’t exactly the most effective way to get your message across.

Consider Mr. T. He’s been pitying fools since the early 80’s, and the world is still awash in foolishness.


It’s too bad, because the message is an important one. The general adoption of unit testing is one of the most fundamental advances in software development in the last 5 to 7 years.

For anyone new out there or someone not familiar with Unit Tests, let’s first get a formal definition so that we are all on the same page. Kapeesh?

What is Unit Testing?

Essentially, a unit test is a method that instantiates a small portion of our application and verifies its behavior independently from other parts. A typical unit test contains 3 phases: First, it initializes a small piece of an application it wants to test (also known as the system under test, or SUT), then it applies some stimulus to the system under test (usually by calling a method on it), and finally, it observes the resulting behavior. If the observed behavior is consistent with the expectations, the unit test passes, otherwise, it fails, indicating that there is a problem somewhere in the system under test. These three unit test phases are also known as Arrange, Act and Assert, or simply AAA.

A unit test can verify different behavioral aspects of the system under test, but most likely it will fall into one of the following two categories: state-based or interaction-based. Verifying that the system under test produces correct results, or that its resulting state is correct, is called state-based unit testing, while verifying that it properly invokes certain methods is called interaction-based unit testing.
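
To make the three phases concrete, here is a small state-based example, assuming JUnit 5 (the Counter class is made up and defined inline just for illustration); an interaction-based test would instead verify that certain methods were invoked, typically on a mock object.

```java
import static org.junit.jupiter.api.Assertions.assertEquals;
import org.junit.jupiter.api.Test;

// A state-based unit test with the three AAA phases marked explicitly.
// Counter is a made-up system under test, defined inline so the example is self-contained.
class CounterTest {

    static class Counter {
        private int value;
        void increment() { value++; }
        int value() { return value; }
    }

    @Test
    void incrementAddsOne() {
        // Arrange: initialize the small piece of the application under test (the SUT)
        Counter counter = new Counter();

        // Act: apply a stimulus to the SUT
        counter.increment();

        // Assert: observe the resulting state and compare it against the expectation
        assertEquals(1, counter.value());
    }
}
```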

King presents a list of 12 specific ways adopting a test-first mentality has helped him write better code:

  1. Unit tests prove that your code actually works
  2. You get a low-level regression-test suite
  3. You can improve the design without breaking it
  4. It’s more fun to code with them than without
  5. They demonstrate concrete progress
  6. Unit tests are a form of sample code
  7. It forces you to plan before you code
  8. It reduces the cost of bugs
  9. It’s even better than code inspections
  10. It virtually eliminates coder’s block
  11. Unit tests make better designs
  12. It’s faster than writing code without tests

Even if you only agree with a quarter of the items on that list – and I’d say at least half of them are true in my experience – that is a huge step forward for software developers. You’ll get no argument from me on the overall importance of unit tests. I’ve increasingly come to believe that unit tests are so important that they should be a first-class language construct.

Obviously, writing testable code requires some discipline, concentration, and extra effort. But software development is a complex mental activity anyway, and we should always be careful and avoid recklessly throwing together new code off the top of our heads.

As a reward, we’ll end up with clean, easy-to-maintain, loosely coupled, and reusable APIs that won’t damage developers’ brains when they try to understand them. After all, the ultimate advantage of testable code is not only the testability itself, but also the ability to easily understand, maintain, and extend that code.

I encourage developers to see the value of unit testing; I urge them to get into the habit of writing structured tests alongside their code. That small change in mindset could eventually lead to bigger shifts like test-first development — but you have to crawl before you can sprint.

If you liked this post, please share it with others! That is the biggest compliment I could receive. Please subscribe if you are a technology enthusiast!

References:
http://www.drdobbs.com/testing/the-embarrassing-costs-of-not-testing-yo/240162967
http://blog.codinghorror.com/i-pity-the-fool-who-doesnt-write-unit-tests/
https://www.toptal.com/qa/how-to-write-testable-code-and-why-it-matters

 

 

The Books That Have Changed My Life (and will change yours) — March 3, 2016

The Books That Have Changed My Life (and will change yours)

If you know me personally, you know that I love to read. It started in grade school where the elementary school I attended offered points and prizes for those who read (and passed tests on) the most books every month.

I remember I graduated with the 3rd-most points in the history of the program (which had been running for something like 10 years). However, the reason that I grabbed a book and started reading so much wasn’t the points and prizes that students were rewarded with – it was because getting lost in a series like “The Lord of the Rings” or reading an autobiography by one of history’s most famous individuals (“My Life and Work” by Henry Ford) was so much more entertaining than watching a quick movie or a TV show. The latter mediums just couldn’t capture the magic the same way a book could.

This continued into college, but by then I had really become most interested in non-fiction books (though The Lord of the Rings will always hold a special place in my heart). I kept reading biographies of people that inspired me, or books written by those I respected or wanted to imitate in a particular skill or area of expertise.

I always found it interesting that so many people rely just on what their professors, teachers, or parents educate them about, instead of supplementing this education with some of the work of the actual masters of a particular domain.

I think that it is incredible that any average human being can pick up a book written by a world expert for only a few bucks! This luxury that we have should not go unappreciated. If you learn how to educate yourself by reading and, more importantly, by putting what you have read into practice, you will gain a much more robust education than even the top schools could rival.

For this reason, it is my goal to read about 100 books a year on topics that I either want to improve on or just find very interesting. Education should never stop once you have finished school. In fact, this should be just the beginning. Through schooling you learn how to learn and the discipline of learning. Now that you have that, you are set free to study the great works and wisdom of the most influential and successful people of the world! Exciting isn’t it??

I think so anyway.

This last year I read a lot of great influential books that impacted my life. Many people think that I only read computer science books because that is my career and one of my passions. However, I find that since I code for my day job 8 hours a day (and have to read a lot of technical literature for that), I get a lot of this knowledge just from experience. Sure there is a lot I still don’t know in this domain, but there is so much more in the world that I could master that would pay much higher dividends.

What I really like to read about falls into twelve broad categories:

  1. AUTOBIOGRAPHIES/BIOGRAPHIES
  2. BUSINESS & ENTREPRENEURSHIP
  3. GETTING THINGS DONE
  4. HAPPINESS/MINDFULNESS
  5. HEALTH
  6. INFLUENCE & PSYCHOLOGY
  7. INTERNET MARKETING
  8. MARKETING
  9. MONEY
  10. PHILOSOPHY
  11. RELATIONSHIPS
  12. SUCCESS

These categories were chosen because I believe these are some of the main areas in life that one can improve on and can learn from the people who have proven they are the best and most knowledgeable in their particular field.

As the title of this post implies, the books below have changed many different aspects of my life, and I try to take at least one main principle from every book and apply it to my life. This has resulted in a much happier, more productive, healthier, and more successful me!

I promise if you read the books on this list, it will change your life as well. 

Without further ado, the list!

THE LIST

AUTOBIOGRAPHIES/BIOGRAPHIES 
  1. Steve Jobs
  2. Made In America
  3. My Life and Work (Henry Ford)
  4. The Autobiography of Andrew Carnegie
  5. Pour Your Heart Into It
  6. Losing My Virginity
  7. Total Recall
  8. The Everything Store
  9. The Snowball
  10. The Score Takes Care of Itself
BUSINESS & ENTREPRENEURSHIP
  1. The 4 Hour Workweek
  2. The E-Myth Revisited
  3. The Hard Thing About Hard Things
  4. Good To Great
  5. Made To Stick
  6. Switch: How To Change Things When Change Is Hard
GETTING THINGS DONE
  1. Flow
  2. The Power of Full Engagement
  3. The One Thing
  4. Leaders Eat Last
  5. The Effective Executive
HAPPINESS/MINDFULNESS
  1. The Power of Now
  2. Peace Is Every Step
  3. Loving What Is
  4. The Happiness Advantage
HEALTH
  1. The Story of the Human Body
  2. 50 Secrets of the World’s Longest Living People
  3. The Primal Blueprint
INFLUENCE & PSYCHOLOGY
  1. Influence
  2. Secrets of Power Negotiating
  3. Introducing NLP
  4. Predictably Irrational
  5. The Believing Brain
INTERNET MARKETING 
  1. Dotcom Secrets
  2. 80/20 Sales and Marketing
MARKETING
  1. Breakthrough Advertising
  2. The 22 Immutable Laws of Marketing
  3. Positioning
  4. Purple Cow
  5. Scientific Advertising
MONEY
  1. The Richest Man in Babylon
  2. Rich Dad, Poor Dad
  3. Money: Master The Game
  4. The Warren Buffett Way
  5. Financial Intelligence for Entrepreneurs
PHILOSOPHY
  1. On The Shortness of Life
  2. Meditations
  3. Letters from a Stoic
  4. Ralph Waldo Emerson’s Essays
RELATIONSHIPS
  1. How To Win Friends and Influence People
  2. The Go-Giver
  3. Difficult Conversations
SUCCESS
  1. Think and Grow Rich
  2. The Slight Edge
  3. Outwitting the Devil
  4. The Magic of Thinking Big
  5. Psycho-Cybernetics

Well, that is the list (for now)! All of these books can be found on Amazon and I believe all of them are affordable ($10-$30) which is crazy for how much value they have brought into my life.

Read them, learn them, love them, LIVE THEM!

If you have any other books that you believe should be added to the list, PLEASE LET ME KNOW IN THE COMMENTS!

If you liked this post, please share it with others! That is the biggest compliment I could receive and encourages me to keep writing.

Please subscribe to this blog if you are a technology enthusiast!

Have a great day!

“The Cloud”: A Beginner’s Introduction To Cloud Providers — January 15, 2016

“The Cloud”: A Beginner’s Introduction To Cloud Providers

Not a day goes by anymore where you don’t hear someone mention “The Cloud” in some context. Cloud services and solutions are all the rage these days.

Why though? What do these solutions offer businesses that they don’t already have? What are the advantages of “The Cloud”? Well, as it turns out, there are many. Below I have listed just a few of the most popular reasons businesses are migrating:

  1. Achieve economies of scale – increase volume output or productivity with fewer people. Your cost per unit, project or product plummets.
  2. Reduce spending on technology infrastructure. Maintain easy access to your information with minimal upfront spending. Pay as you go (weekly, quarterly or yearly), based on demand.
  3. Globalize your workforce on the cheap. People worldwide can access the cloud, provided they have an Internet connection.
  4. Streamline processes. Get more work done in less time with fewer people.
  5. Reduce capital costs. There’s no need to spend big money on hardware, software or licensing fees.
  6. Improve accessibility. You have access anytime, anywhere, making your life so much easier!
  7. Monitor projects more effectively. Stay within budget and ahead of completion cycle times.
  8. Less personnel training is needed. It takes fewer people to do more work on a cloud, with a minimal learning curve on hardware and software issues.
  9. Minimize licensing new software. Stretch and grow without the need to buy expensive software licenses or programs.
  10. Improve flexibility. You can change direction without serious “people” or “financial” issues at stake.

As you can see, the Cloud can offer a competitive advantage for companies that understand how to utilize its strengths.

So have I convinced you that the Cloud is something you should embrace yet? I hope so. I want your next question to be

“Jason, we love the Cloud, but what Cloud Service Provider should we use??!”

Well, I am glad you asked! As it turns out, there are currently three great providers to choose from, and each offers its own flavor of cloud. These three big players are Microsoft Azure, Amazon Web Services (AWS), and Google Cloud Platform (GCP).

Amazon Web Services has been around the longest, and it is (by most accounts) the most mature as of this writing. That doesn’t necessarily mean it is the one for you, though.

While AWS is the clear leader in the market (as it has been since its inception in 2006), Azure is the fastest growing cloud provider, with triple digit growth in 2014 and 2015. GCP, far behind in market share, is still considered a top visionary based on the completeness of their offering, go-to-market strategy, enhanced performance and global infrastructure.

When evaluating a cloud solution, there are a lot of questions you need to ask yourself about how you will be utilizing it: which services you need, and which services you do not care about.

However, when deciding to move your technology solution to the cloud, there are four areas you are most likely to be concerned about. These areas are going to be:

  1. Global Network/Networking
  2. Storage
  3. Computing
  4. Pricing Structure

These four areas are what many IT organizations evaluate when they are looking to choose one Cloud Service Provider over another. The details of each are constantly changing, so make sure you read up on each company’s offerings on their own website to check whether this article is still accurate; I have made my best effort to keep it accurate at the time of this writing.

So let’s get into it. Which Platform is right for you?

Global Network/Networking

AWS has the widest range of region availability with eleven regions: three in the US, two in the EU, two in East Asia, one in China, one in Australia, and one in South America, as well as another region in the US for the sole use of US government agencies (AWS GovCloud). Each of these regions is comprised of multiple data centers, or “Availability Zones”, and edge locations. AWS uses private networks for the data centers’ connectivity within a region, and the Internet for inter-region connectivity.

Azure runs 17 data centers, referred to by Microsoft as “regions” across Asia, Australia, Europe and North America. Lacking their own network infrastructure, they use the Internet for data center connectivity, labeling the data with QoS tags.

GCP has only three regions: Central US, Western Europe, and East Asia. Each of these regions is comprised of multiple data centers or “Zones”. Contrary to AWS and Azure, however, connectivity between data centers is done based on Google’s private global network, both on the regional level, and between regions.

(source: CloudAcademy)

Storage/IO

AWS provides ephemeral (temporary) storage that is allocated once an instance is started and is destroyed when the instance is terminated. It provides Block Storage that is equivalent to hard disks, in that it can either be attached to any instance or kept separate. AWS also offers object storage with their S3 Service, and archiving services with Glacier. AWS fully supports relational and NoSQL databases and Big Data.

Google’s Cloud Platform similarly provides both temporary storage and persistent disks. For object storage, GCP has Google Cloud Storage. GCP supports relational DBs through Google Cloud SQL. Technologies pioneered by Google, like BigQuery and Bigtable, as well as Hadoop, are naturally fully supported. Google’s Nearline offers archiving as cheap as Glacier, but with virtually no latency on retrieval.

Azure uses temporary storage (D drive) and Page Blobs (Microsoft’s Block Storage option) for VM-based volumes. Block Blobs and Files serve for Object Storage. Azure supports both relational and NoSQL databases, and Big Data, through Windows Azure Table and HDInsight.

Computing/Speed

When configuring a deployment, there are multiple parameters that can affect speed. These include the number and generation of CPU cores, number of instances, network speed, caching, and storage type. While speed is important, it is only one parameter out of many that should be considered in vendor selection, so benchmark test results should be taken with a grain of salt. Overall, based on a test conducted by InfoWorld in 2014 (image below), of the three vendors, GCP is fastest, followed by AWS and Azure.

(source: www.cloudyn.com)

[Chart: DaCapo benchmark running-time results for the three providers]

(Image source: InfoWorld)

 

Pricing Structure

Calculating instance cost is a complex task that requires looking at computing power, memory resources, and networking needs, then attempting to calculate project lifecycle needs for proper capacity, resource and personnel planning.

The various tools provided by the three vendors aim to make this process easier, but it still leaves many in the dark.

AWS charges customers by rounding up the number of hours used, so the minimum use is one hour. AWS instances can be purchased using any one of three models:

  • on demand – customers pay for what they use without any upfront cost
  • reserved – customers reserve instances for 1 or 3 years with an upfront cost that is based on the utilization
  • spot – customers bid for the extra capacity available

GCP charges for instances by rounding up the number of minutes used, with a minimum of 10 minutes. Google recently announced new sustained-use pricing for computing services that offers a simpler and more flexible alternative to AWS’s reserved instances. Sustained-use pricing automatically discounts the on-demand baseline hourly rate as a particular instance is used for a larger percentage of the month.

Azure charges customers by rounding up the number of minutes used for on-demand instances. Azure also offers short-term commitments with discounts.

 

[Chart: DaCapo benchmark cost results for the three providers]

(Image source: InfoWorld)

 

Pricing Models
  • AWS: per hour, rounded up – on demand, reserved, or spot
  • GCP: per minute, rounded up (minimum 10 minutes) – on demand, with sustained-use discounts
  • Azure: per minute, rounded up – on demand, with short-term commitments (pre-paid or monthly)

AWS vs Azure vs Google: Pricing and Models (source: CloudAcademy)
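
To see why the rounding granularity matters, here is a small back-of-the-envelope sketch; the runtime and rates are made-up numbers purely for illustration, not actual AWS/GCP/Azure prices.

```java
// Illustrative only: the rates below are made-up numbers, not real cloud prices.
// The point is how the billing granularity (hour vs minute) changes the bill.
public class BillingRoundingSketch {

    public static void main(String[] args) {
        int runtimeMinutes = 95;              // hypothetical workload: 1 hour 35 minutes
        double hourlyRate = 0.10;             // made-up per-hour rate
        double perMinuteRate = hourlyRate / 60.0;

        // Per-hour billing, rounded up: 95 minutes is billed as 2 full hours
        long billedHours = (long) Math.ceil(runtimeMinutes / 60.0);
        double hourlyBill = billedHours * hourlyRate;

        // Per-minute billing with a 10-minute minimum: 95 minutes is billed as 95 minutes
        long billedMinutes = Math.max(runtimeMinutes, 10);
        double minuteBill = billedMinutes * perMinuteRate;

        System.out.printf("Rounded to hours:   $%.4f%n", hourlyBill);   // $0.2000
        System.out.printf("Rounded to minutes: $%.4f%n", minuteBill);   // about $0.1583
    }
}
```

For this 95-minute workload, per-hour rounding bills two full hours while per-minute rounding bills only the 95 minutes, so the same job can cost noticeably different amounts purely because of the billing model.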

Summary

As you can see, there is a lot to evaluate when considering which provider to go with. Ultimately, it comes down to what your problem is and what type of solution you are implementing. I hope this quick guide to the top cloud providers sheds a little insight into the similarities and differences between the companies, as well as some of the questions you should be asking yourself when transitioning to the cloud.

If you found this article useful, please share it or subscribe to my blog! If you didn’t like it, then tell me why in the comments! I write these articles to help other developers who are in the same situation as I am, and to share insights as I come across them. Tell me what is helpful and what is not! Thanks, and have a great day!

-Jason