In my previous article I gave an introduction to LIDAR for archaeology. This article takes you one step further. It walks through an example of processing real-world LIDAR point clouds for archaeology in order to create a digital surface model (DSM).

In today’s example we will be looking at LIDAR data for Tikal, a Mayan city located in Guatemala. This city flourished between 200 and 850 A.D. and became home to Temple IV, the tallest pre-Columbian structure in the Americas.

We will be going from site, to point cloud, to digital surface model (DSM) in a few…

If you read this blog often, you have probably already seen an article or two about LIDAR being used in archaeology. This article aims to give you a proper introduction to the technology and its applications in archaeology.

What is LIDAR?

LIDAR stands for Light Detection and Ranging. It is a method for measuring distance using light. Take the image below as an example:

Image for post
Determination of Tree Heights With Unmanned Air Vehicles

A sensor is attached to an airplane. Laser light is shot down, bouncing off various types of surfaces. We can measure distance to an object based on how long it takes for the light to bounce back.
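
The time-of-flight idea described above can be sketched in a few lines of Python. This is a minimal illustration, not the processing used by any particular LIDAR sensor: the function name `distance_from_round_trip` is hypothetical, and the sketch assumes the pulse travels at the vacuum speed of light and ignores atmospheric effects.

```python
# Speed of light in a vacuum, in metres per second (exact by definition).
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0


def distance_from_round_trip(round_trip_seconds: float) -> float:
    """Return the one-way distance in metres for a measured round-trip time.

    Hypothetical helper for illustration only.
    """
    if round_trip_seconds < 0:
        raise ValueError("round-trip time cannot be negative")
    # The pulse travels to the surface and back, so halve the total path.
    return SPEED_OF_LIGHT_M_PER_S * round_trip_seconds / 2.0


if __name__ == "__main__":
    # A pulse that returns after 2 microseconds hit something about 300 m away.
    print(distance_from_round_trip(2e-6))
```

Halving the path is the key step: the sensor measures the time for the trip out and back, while the distance of interest is only one way.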

The…

I have spoken a lot in this blog about the process of bringing machine learning code to production. However, once the models are in production you are not done; you are just getting started. The model will have to face its worst enemy: The Real World!

Image for post
Image credit to: Tim Smit

This post focuses on what kinds of monitoring you can put in place in order to understand how your model is performing in the real world. It covers both continuous training and the usage of the trained model. It looks into:

  • Monitoring your infrastructure
  • Monitoring the data
  • Monitoring the training
  • Monitoring value…

In the past few years there has been a large increase in tools trying to solve the challenge of bringing machine learning models to production. One thing that these tools seem to have in common is the incorporation of notebooks into production pipelines. This article aims to explain why this drive towards the use of notebooks in production is an anti-pattern, giving some suggestions along the way.

What is a notebook?

Let’s start by defining what these are, for those readers who haven’t been exposed to notebooks, or call them by a different name.

Notebooks are web interfaces that allow a user to…

I received a Grove Starter Kit at an internal work conference a few months ago. Of course, I did something entirely useless with it, so here is the tutorial on how to make your own useless Marvin.

Image for post

This is Marvin, and he does the following:

  • Complains that he is melting when the temperature is above 30 °C
  • Tells you to turn the heater on if the temperature is below 15 °C
  • Tells you the temperature when it is between 15 °C and 30 °C
  • Swings a pendulum when the touch sensor is touched
  • Lights up when playing the drum
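
The temperature rules in the list above could be sketched roughly as follows. This is only an illustration in Python: the function name `marvin_response` and the exact message wording are assumptions, and reading the value from the kit's temperature sensor is deliberately left out.

```python
def marvin_response(temperature_c: float) -> str:
    """Pick Marvin's reply for a temperature in degrees Celsius.

    Hypothetical helper; the messages are invented for illustration.
    """
    if temperature_c > 30:
        # Above 30 °C Marvin complains that he is melting.
        return "I am melting!"
    if temperature_c < 15:
        # Below 15 °C Marvin asks for the heater.
        return "Turn the heater on."
    # From 15 °C to 30 °C inclusive Marvin just reports the temperature.
    return f"It is {temperature_c:.1f} °C."


if __name__ == "__main__":
    for reading in (35.0, 10.0, 22.5):
        print(marvin_response(reading))
```

The boundary values 15 °C and 30 °C fall into the "report the temperature" branch here, which is one reasonable reading of the list.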

When it comes to data products, there is a common misconception that these cannot be put through automated testing. Although some parts of the pipeline cannot go through traditional testing methodologies due to their experimental and stochastic nature, most of the pipeline can. In addition, the more unpredictable algorithms can be put through specialised validation processes.

Let’s take a look at traditional testing methodologies and how we can apply these to our data/ML pipelines.

Testing Pyramid

Your standard simplified testing pyramid looks like this:

Image for post

This pyramid is a representation of the types of tests that…

The most common approach to deploying machine learning models is to expose an API endpoint. This endpoint is generally called via a POST request whose body contains the input data for the model, and it responds with the output of the model. However, an API endpoint is not always the most appropriate solution for your use case.

There are, for example, use cases that may require a machine learning model to be deployed on a mobile device, such as:

  • The need to use the model offline or in low connectivity areas.
  • The need to minimize the amount of…

Traditionally, neural networks are trained by adjusting weights based on a measure of error that is passed back through the network. This error is calculated by comparing the network's output for a given input against the expected value. The person creating the neural network would spend some time fiddling with the network’s parameters until it can learn from the given data by adjusting its weights using that error.

This article is a high-level introduction to how evolutionary algorithms can be used to ease this process.

Before I begin, let’s take a look at some existing…

A missing smile at the passing of a palm tree
like a gust of winter wind — cold, yet unseen.
The setting of the clutching branches free
abruptly calm and unexpectedly serene.

A missing smile at the passing of an ice cream
travelling through the void of a mariachi’s guitar
merging into an eternal musical stream
on a journey to becoming the dust of a star

A missing smile at the passing of a beach
silenced by the laughter of a small wave.
Your absence there, yet out of reach
leaving a small but timeless breach

Originally published at http://exploringmycreativity.wordpress.com on July 1, 2018.

This post is about setting up the infrastructure to run your Spark jobs on a cluster hosted on Amazon.

Before we start, here is some terminology that you will need to know:

  • Amazon EMR — The Amazon service that provides a managed Hadoop framework
  • Terraform — A tool for setting up infrastructure using code

At the end of this post you should have an EMR 5.9.0 cluster that is set up in the Frankfurt region with the following tools:

  • Hadoop 2.7.3
  • Spark 2.2.0
  • Zeppelin 0.7.2
  • Ganglia 3.7.2
  • Hive 2.3.0
  • Hue 4.0.1
  • Oozie 4.3.0

By default EMR Spark clusters come with…

Kristina Georgieva

Data scientist, software engineer, poet, writer, blogger, amateur painter
