Finding the nearest locations around you using AWS Amplify — Part 2

Gerard Sans
13 min read · Apr 1, 2020

Create a custom GraphQL distance-aware search using AWS AppSync

This is the second article in a series that provides a comprehensive step-by-step guide to enabling distance-aware searches in your full-stack serverless applications using AWS Amplify and AWS AppSync.

In this article, you will learn how to create a custom GraphQL distance-aware search. First, we will introduce Amazon Elasticsearch and some tools. Then, you will create an index to store your geolocation data and change your GraphQL Schema by adding a new distance-aware search query. In the last step, you will create a custom GraphQL resolver that searches within a given distance of a location using Amazon Elasticsearch, including paginating the results.

The whole series includes:

Please let me know if you have any questions or want to learn more at @gerardsans.

Introduction to Amazon Elasticsearch service

At the end of Part 1 of this series, you changed your GraphQL Schema to include location coordinates and added @searchable to a type to enable distance-aware searches.

After pushing changes to the cloud, AWS Amplify provisioned a new Amazon Elasticsearch domain to store and query your geolocation data. This domain is an Elasticsearch cluster in the cloud with provisioned compute and storage resources. Let’s talk a bit about what exactly Amazon Elasticsearch is.

Amazon Elasticsearch is a fully managed service that makes it easy for you to deploy, secure, and run open source Elasticsearch at scale. It provides support for Elasticsearch APIs, managed Kibana and integrates with other AWS services.

Introduction to open source Elasticsearch

Elasticsearch is an open source search and analytics engine based on the Lucene library. By using Elasticsearch we can provide advanced search capabilities to our users including the following:

  • Natural language searches: using fuzzy matching, multi match, custom boosting, logical operators, wildcards, regular expressions and range queries.
  • Advanced searches: using terms (structured searches), filters, suggestions, custom relevance scores and facets (e.g. via categories).
  • Geolocation searches: distance from a location or within a custom region (bounding box or polygon).

See a representation of geolocation searches below.

Searches by distance or within a custom region (using a bounding box or polygon).

In this article, we are going to focus on the first type, using the distance from a center location to run our queries.

Elasticsearch architecture

Find an overview of the Elasticsearch architecture in the diagram below. We will introduce Kibana later on.

Elasticsearch Architecture Overview (source elastic.co)

Elasticsearch cluster and data nodes (servers)

An Elasticsearch cluster (domain) is composed of one or more data nodes working together. A data node is a running instance, usually hosted on a single server.

To explain Elasticsearch, we are going to compare it to a relational database. This is only an analogy to aid understanding and is not strictly accurate.

Indexes and mappings (databases and schemas)

An Index works like a database with mappings that hold schema definitions for its internal types. An Index is a logical namespace mapped to one or more shards (primary and replicas).

For our implementation, we will create a new index bikepoint to store our geolocation data. Let’s look at our type definition in our GraphQL Schema.

type BikePoint @model @searchable {
  id: ID!
  name: String!
  description: String
  location: Location
  bikes: Int
}

Elasticsearch can automatically create the mappings for GraphQL types (ID, String, Int, Float and Boolean) but not for the location field. For this reason, we will manually create a specific mapping for it later on.

Types, fields and documents (tables, columns and rows)

A type works like a table. Each type has a list of fields and mappings that tell Elasticsearch how to analyse and store them properly. A field works like a column and can contain scalars or complex structures. A document works like a row: each document is stored as a JSON object in the _source field and returned in searches.

There are four basic operations in Elasticsearch:

  • Index. Processing a document and storing it in an index for retrieval.
  • Delete. Removing a document from an index.
  • Update. Removing a document and indexing it as a new document.
  • Search. Retrieving documents or aggregates from one or more indices.

These are some of the basic concepts for Elasticsearch. If you want to learn more check out Elasticsearch introduction.

Finding your new Amazon Elasticsearch domain

To access your Amazon Elasticsearch domain, run the following command using the Amplify CLI and select GraphQL.

amplify console api

This will open the AWS AppSync console for your GraphQL API. Navigate to Data Sources.

Data Sources for LondonCycles app.

Complete Part 1 of this series if you can’t see them.

The first item is the Amazon DynamoDB table connected to your GraphQL Schema with @model, for us BikePointTable. The second item is the Amazon Elasticsearch domain created by @searchable. Follow the resource link for the latter to open the Amazon Elasticsearch service dashboard.

Amazon Elasticsearch Dashboard with a new domain created by @searchable.

Now navigate to the new Amazon Elasticsearch domain, to find all related information including Elasticsearch version, endpoint and access to Kibana.

Kibana is an open source data visualization dashboard for Elasticsearch. You can use Kibana Developer Tools to test your search queries.

Amazon Elasticsearch domain details.

Kibana Developer Tools

Along with Elasticsearch, you have access to Kibana Developer Tools. This is an environment you can use to test your search queries. Let’s see how you can access it.

To use Kibana securely, install the AWS Agent browser extension and provide your AWS profile information (access key ID and secret access key).

You can run the command below to quickly find this information:

$ cat ~/.aws/credentials                  (Linux & Mac)
> type %USERPROFILE%\.aws\credentials     (Windows)
[default]
aws_access_key_id=<<KEY-ID>>
aws_secret_access_key=<<SECRET-ACCESS-KEY>>

Once you have done this, follow the Kibana link on the Amazon Elasticsearch domain details page and navigate to Developer Tools.

Step 1: setting up a distance-aware Elasticsearch index

To enable distance-aware searches, we need to make sure Elasticsearch is properly set up. Before we make further changes, we need to:

  • Create a new index: use the name of your type, with all letters in lowercase, e.g. bikepoint.
  • Create a new geolocation mapping: this is so latitude and longitude fields are indexed properly as a geo_point.

You can run the following commands in Kibana Developer Tools.

# Create a new index
PUT /bikepoint

# Create a new mapping to set location type to geo_point
PUT /bikepoint/_mapping/doc
{
  "properties": {
    "location": {
      "type": "geo_point"
    }
  }
}

Important: after running these commands you can start using your geolocation data in GraphQL mutations.
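To make the mapping more concrete, here is a minimal Python sketch of a document shaped for the geo_point field. It assumes the lat/lon object form used throughout this series; the helper name to_geo_point and the range checks are illustrative additions, not part of Amplify or Elasticsearch.

```python
def to_geo_point(lat: float, lon: float) -> dict:
    """Return a {"lat", "lon"} object, the shape used for the location field."""
    # Illustrative validation: valid latitude and longitude ranges in degrees.
    if not -90.0 <= lat <= 90.0:
        raise ValueError(f"latitude out of range: {lat}")
    if not -180.0 <= lon <= 180.0:
        raise ValueError(f"longitude out of range: {lon}")
    return {"lat": lat, "lon": lon}


# Sample document using values from the LondonCycles data in this article.
bike_point = {
    "id": "BikePoints_83",
    "name": "Panton Street, West End",
    "location": to_geo_point(51.509639, -0.13151),
}
print(bike_point["location"])
```

Validating coordinates before indexing is a design choice in this sketch: it surfaces swapped lat/lon values early instead of at query time.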

Step 2: creating a custom GraphQL distance-aware search query

In Part 1, we saw how to use the default GraphQL search query that AWS AppSync creates. In this section, we will create a custom GraphQL distance-aware search to find items within a given distance of a location.

Below is the UI for this search query as used in the LondonCycles app. We will build the client UI in Part 3.

Find bikes nearby feature in LondonCycles app.

Add the following code to your GraphQL Schema:

type Query {
  nearbyBikeStations(
    location: LocationInput!,
    m: Int,
    limit: Int,
    nextToken: String
  ): ModelBikePointConnection
}

input LocationInput {
  lat: Float!
  lon: Float!
}

type ModelBikePointConnection {
  items: [BikePoint]
  total: Int
  nextToken: String
}

First, we have the main query, followed by the input for the location argument and the type for the results. The nearbyBikeStations query takes four arguments:

  • location: origin coordinates from where we want to run our search.
  • m: maximum distance from location in meters. Default: 500 meters.
  • limit: how many results we want back. Default: 10 results.
  • nextToken: the distance of the last result on the previous page, used to fetch the next page.

For the results, we are using the Connection pattern. This will allow us to implement pagination and access the total when filtering results.
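As a rough illustration of how a client might call this query, the Python sketch below builds the JSON payload for a GraphQL request. The function name build_nearby_request and the operation name are assumptions made for this example; optional arguments are simply omitted so the defaults described above apply.

```python
import json

# GraphQL document using variables for the four arguments of the query.
NEARBY_QUERY = """
query NearbyBikes($location: LocationInput!, $m: Int, $limit: Int, $nextToken: String) {
  nearbyBikeStations(location: $location, m: $m, limit: $limit, nextToken: $nextToken) {
    items { id name location { lat lon } }
    total
    nextToken
  }
}
"""


def build_nearby_request(lat, lon, m=None, limit=None, next_token=None):
    """Build the {query, variables} payload for nearbyBikeStations (hypothetical helper)."""
    variables = {"location": {"lat": lat, "lon": lon}}
    # Leave optional arguments out so the resolver defaults are used.
    if m is not None:
        variables["m"] = m
    if limit is not None:
        variables["limit"] = limit
    if next_token is not None:
        variables["nextToken"] = next_token
    return {"query": NEARBY_QUERY, "variables": variables}


payload = build_nearby_request(51.510239, -0.134167, m=500, limit=3)
print(json.dumps(payload["variables"], indent=2))
```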

Step 3: creating a custom AWS AppSync resolver

We will now create a custom AWS AppSync resolver for Amazon Elasticsearch service. This will allow you to use GraphQL to store and retrieve data from the Amazon Elasticsearch domain created by the @searchable GraphQL transform in Part 1.

For an introduction to AWS AppSync resolvers you can read AWS AppSync Velocity Templates Guide.

This resolver works by mapping an incoming GraphQL request to an Amazon Elasticsearch request, and then mapping the response back to GraphQL, as shown in the diagram below.

Custom GraphQL distance-aware search query flow

In Step 1, a client runs the GraphQL distance-aware search query, which creates a request to the GraphQL API. In Step 2, AWS AppSync processes the resolver request template and runs it against Amazon Elasticsearch. In Step 3, AWS AppSync processes the response and runs the corresponding resolver response template. Finally, in Step 4, AWS AppSync sends the result of the resolver response template back to the client as a GraphQL response JSON object.

To create a custom AWS AppSync resolver, we need to create both a request template and a response template. These will be part of your API. Let’s see where these resources are in your project.

/amplify/backend/api/
TransportForLondonAPI/resolvers
Query.nearbyBikeStations.req.vtl
Query.nearbyBikeStations.res.vtl

Let’s start by looking at an example of the nearbyBikeStations search query. In this example, we search for bike stations near our user’s current location, [-0.134167, 51.510239] (longitude, latitude), within 500 meters, limiting the search to 3 results.

query NearbyBikes {
  nearbyBikeStations(
    location: { lat: 51.510239, lon: -0.134167 },
    m: 500,
    limit: 3
  ) {
    items {
      id
      name
      location { lat lon }
    }
    total
  }
}

Once we run this query, AWS AppSync will process it and run the corresponding resolver request template below.

## Query.nearbyBikeStations.req.vtl
#set( $distance = $util.defaultIfNull($ctx.args.m, 500) )
#set( $limit = $util.defaultIfNull($ctx.args.limit, 10) )
{
  "version": "2017-02-28",
  "operation": "GET",
  "path": "/bikepoint/doc/_search",
  "params": {
    "body": {
      #if( $context.args.nextToken )"search_after": ["$context.args.nextToken"], #end
      "size" : ${limit},
      "query": {
        "bool" : {
          "must" : { "match_all" : {} },
          "filter" : {
            "geo_distance" : {
              "distance" : "${distance}m",
              "distance_type": "arc",
              "location" : $util.toJson($ctx.args.location)
            }
          }
        }
      },
      "sort": [{
        "_geo_distance": {
          "location": $util.toJson($ctx.args.location),
          "order": "asc",
          "unit": "m",
          "distance_type": "arc"
        }
      }]
    }
  }
}

Let’s break down this template. First, we take care of default values for the distance and limit arguments using the $util.defaultIfNull function.

#set( $distance = $util.defaultIfNull($ctx.args.m, 500) )
#set( $limit = $util.defaultIfNull($ctx.args.limit, 10) )

$ctx.args holds an object that corresponds to our query arguments: location, m, limit and nextToken.

Next, we set the operation and the index for Amazon Elasticsearch.

"operation": "GET",  
"path": "/bikepoint/doc/_search",

As part of the search body we set the following:

  • search_after, the last sort value from the previous page, used for pagination.
  • size, number of results.
  • query, query details and filters.
  • sort, sorting fields.

As part of the query we have a boolean clause including must and match_all with an empty clause. This will assign a score of 1.0 to all resulting documents.

"query": {        
"bool" : {
"must" : { "match_all" : {} },
"filter" : {
"geo_distance" : {
"distance" : "${distance}m",
"distance_type": "arc",
"location" : $util.toJson($ctx.args.location)
}
}

}
}

For our query, we will use the geo_distance query, passing the location and the distance in meters.

We are using arc for accuracy, as distance calculations will use the haversine formula, which determines the great-circle distance between two points on a sphere given their coordinates.

You can use other distance units, e.g. mi, yd, ft, km or m. If you are working with large datasets, consider changing distance_type to plane, which is faster but less accurate over long distances and close to the poles.
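To give a feel for what the arc calculation does, here is a small Python sketch of the haversine formula. It is an illustration rather than the code Elasticsearch runs, and the Earth radius constant is an assumption (a commonly used mean radius).

```python
import math

# Assumption: mean Earth radius in meters; Elasticsearch may use a slightly different value.
EARTH_RADIUS_M = 6_371_008.8


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    # Haversine of the central angle between the two points.
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


# User location from the example query and the Panton Street station.
distance = haversine_m(51.510239, -0.134167, 51.509639, -0.13151)
print(round(distance, 1))
```

With these coordinates the result should come out close to the sort value of roughly 195.6 meters that appears in the sample Elasticsearch response later in this article.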

This will return all the results within the specified distance. In order to get the results ordered by distance, we need to add the following changes so we get the closest results first.

"sort": [{ 
"_geo_distance": {
"location": $util.toJson($ctx.args.location),
"order": "asc",
"unit": "m",
"distance_type": "arc"
}
}]
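Putting the pieces together, the Python sketch below mirrors the search body that the request template renders. It is only an approximation for reasoning about the template: the function name build_search_body is hypothetical, and the defaults (500 meters, 10 results) are taken from the template above.

```python
def build_search_body(location, m=None, limit=None, next_token=None):
    """Approximate the Elasticsearch body produced by the request template."""
    distance = 500 if m is None else m      # mirrors $util.defaultIfNull($ctx.args.m, 500)
    size = 10 if limit is None else limit   # mirrors $util.defaultIfNull($ctx.args.limit, 10)
    body = {}
    if next_token is not None:
        # Only present when paginating; the template passes the token as a string.
        body["search_after"] = [next_token]
    body["size"] = size
    body["query"] = {
        "bool": {
            "must": {"match_all": {}},
            "filter": {
                "geo_distance": {
                    "distance": f"{distance}m",
                    "distance_type": "arc",
                    "location": location,
                }
            },
        }
    }
    # Closest results first.
    body["sort"] = [
        {
            "_geo_distance": {
                "location": location,
                "order": "asc",
                "unit": "m",
                "distance_type": "arc",
            }
        }
    ]
    return body


body = build_search_body({"lat": 51.510239, "lon": -0.134167}, limit=3)
print(body["size"], body["query"]["bool"]["filter"]["geo_distance"]["distance"])
```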

Let’s see what a query and its results look like.

Example search query and results.

We have covered all the necessary details for the request so now we can look into our response template.

## Query.nearbyBikeStations.res.vtl
#set( $items = [] )
#foreach( $entry in $context.result.hits.hits )
  #if( !$foreach.hasNext )
    #set( $nextToken = "$entry.sort.get(0)" )
  #end
  $util.qr($items.add($entry.get("_source")))
#end
$util.toJson({
  "items": $items,
  "total": $ctx.result.hits.total,
  "nextToken": $nextToken
})

The first part of the template prepares the $items array to return the results. The fields in _source match the data in the Amazon DynamoDB BikePoints table, including some internal fields: __typename, createdAt, updatedAt.

{
  "took": 17,
  "hits": {
    "total": 12,
    "hits": [
      {
        "_index": "bikepoint",
        "_type": "doc",
        "_id": "BikePoints_83",
        "_source": {
          "id": "BikePoints_83",
          "__typename": "BikePoint",
          "name": "Panton Street, West End",
          "location": {
            "lon": -0.13151,
            "lat": 51.509639
          },
          "createdAt": "2020-03-02T14:48:44.617Z",
          "updatedAt": "2020-03-02T14:48:44.617Z"
        },
        "sort": [
          195.6063243635008
        ]
      }
    ]
  }
}

We are setting the value for $nextToken with the sort value 195.6063243635008 to allow pagination. In the next section we will explain in detail how pagination uses this value to paginate results.
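The logic of the response template can be sketched in Python as follows. This is an approximation for illustration only; map_search_response is a hypothetical name, and the input is assumed to have the same shape as the Elasticsearch response shown above.

```python
def map_search_response(result):
    """Turn an Elasticsearch search result into the connection shape used by the query."""
    hits = result["hits"]["hits"]
    items = []
    next_token = None
    for index, entry in enumerate(hits):
        if index == len(hits) - 1:
            # Like !$foreach.hasNext: keep the sort value of the last hit as the token.
            next_token = str(entry["sort"][0])
        items.append(entry["_source"])
    return {
        "items": items,
        "total": result["hits"]["total"],
        "nextToken": next_token,
    }


sample = {
    "took": 17,
    "hits": {
        "total": 12,
        "hits": [
            {
                "_id": "BikePoints_83",
                "_source": {"id": "BikePoints_83", "name": "Panton Street, West End"},
                "sort": [195.6063243635008],
            }
        ],
    },
}
connection = map_search_response(sample)
print(connection["total"], connection["nextToken"])
```

Note that total is the number of matching documents (12 here), not the number of items on the current page.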

Paginating results from Elasticsearch using AWS AppSync

Let’s see an example using the pagination implemented in our custom AWS AppSync resolver for the nearbyBikeStations query.

To paginate, we first run our query and grab the value of nextToken. For this example, we use a page size of a single item (limit: 1).

////////////////////////////////////////////////////////////////////
// First query. We grab nextToken from results
nearbyBikeStations(
  location: { lat: 51.510239, lon: -0.134167 }, m: 500, limit: 1
) {
  items { id name location { lat lon } }
  total
  nextToken
}
////////////////////////////////////////////////////////////////////
// Query sent to Elasticsearch
GET /bikepoint/doc/_search
{
  "size" : 1,
  "query": {...},
  "sort": [...]
}
////////////////////////////////////////////////////////////////////
// Result
{
  "took": 25,
  "hits": {
    "total": 12,
    "hits": [
      {
        "_index": "bikepoint",
        "_type": "doc",
        "_id": "BikePoints_83",
        "_source": {
          "__typename": "BikePoint",
          "name": "Panton Street, West End",
          "location": {
            "lon": -0.13151,
            "lat": 51.509639
          },
          "id": "BikePoints_83"
        },
        "sort": [
          195.6063243635008
        ]
      }
    ]
  }
}

Our request template translates this query, with the argument limit: 1, into an Elasticsearch query using size: 1. Our response template picks the sort value 195.6063243635008 and returns it as nextToken.

At this point, we want to navigate to the second result, so we run the query again, now including the previous nextToken value.

////////////////////////////////////////////////////////////////////
// Second query as user paginates. We use previous nextToken
nearbyBikeStations(
  location: { lat: 51.510239, lon: -0.134167 }, m: 500, limit: 1,
  nextToken: "195.6063243635008"
) {
  items { id name location { lat lon } }
  total
  nextToken
}
////////////////////////////////////////////////////////////////////
// Query sent to Elasticsearch
GET /bikepoint/doc/_search
{
  "search_after": [195.6063243635008],
  "size" : 1,
  "query": {...},
  "sort": [...]
}

By using search_after with the sort value we are able to navigate to the next result. Find how it works in more detail at Elasticsearch search_after documentation.
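The pagination flow above can be sketched as a simple loop in Python. Everything here is illustrative: iterate_pages and fake_fetch_page are hypothetical helpers, and the in-memory list of distances stands in for Elasticsearch. Because the response template returns a nextToken whenever a page has hits, the loop stops when a page comes back empty.

```python
def iterate_pages(fetch_page, limit):
    """Yield pages of items until the search returns no more results."""
    next_token = None
    while True:
        page = fetch_page(limit=limit, next_token=next_token)
        if not page["items"]:
            break
        yield page["items"]
        next_token = page["nextToken"]
        if next_token is None:
            break


# Stand-in for the GraphQL API: distances in meters, already sorted ascending.
DISTANCES = [195.6, 240.1, 310.7]


def fake_fetch_page(limit, next_token=None):
    after = float(next_token) if next_token is not None else float("-inf")
    # search_after returns hits whose sort value comes strictly after the token.
    remaining = [d for d in DISTANCES if d > after]
    hits = remaining[:limit]
    return {
        "items": [{"distance": d} for d in hits],
        "total": len(DISTANCES),
        "nextToken": str(hits[-1]) if hits else None,
    }


pages = list(iterate_pages(fake_fetch_page, limit=1))
print(len(pages))
```

One caveat of sorting on distance alone: two stations at exactly the same distance share a sort value, so one of them could be skipped between pages. Adding a second, unique sort field as a tiebreaker is a common way to avoid this.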

Step 4: adding custom AWS AppSync resolver to your Amplify project

Using Amplify, we can add a custom AWS AppSync resolver by adding its templates to our project. These will be part of CustomResources.json in your api stack.

/amplify/backend/api/
TransportForLondonAPI/stacks
CustomResources.json

Add your resolver templates to the Resources section as shown below, adjusting the names in each entry to match your own project.

{
  "Resources": {
    "QueryNearbyBikeStations": {
      "Type": "AWS::AppSync::Resolver",
      "Properties": {
        "ApiId": { "Ref": "AppSyncApiId" },
        "DataSourceName": "ElasticSearchDomain",
        "TypeName": "Query",
        "FieldName": "nearbyBikeStations",
        "RequestMappingTemplateS3Location": {
          "Fn::Sub": [
            "s3://${S3DeploymentBucket}/${S3DeploymentRootKey}/resolvers/Query.nearbyBikeStations.req.vtl", {
              "S3DeploymentBucket": { "Ref": "S3DeploymentBucket" },
              "S3DeploymentRootKey": { "Ref": "S3DeploymentRootKey" }
          }]
        },
        "ResponseMappingTemplateS3Location": {
          "Fn::Sub": [
            "s3://${S3DeploymentBucket}/${S3DeploymentRootKey}/resolvers/Query.nearbyBikeStations.res.vtl", {
              "S3DeploymentBucket": { "Ref": "S3DeploymentBucket" },
              "S3DeploymentRootKey": { "Ref": "S3DeploymentRootKey" }
          }]
        }
      }
    }
  }
}

Take special care to make sure template file names and paths match the ones in your project and resources entries don’t conflict with any other resources you may already have.
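As a quick sanity check before pushing, a small script along these lines could confirm that every template referenced in CustomResources.json exists in your resolvers folder. This is a sketch under assumptions: the helper names are hypothetical, and it only understands the Fn::Sub layout shown above.

```python
import os


def resolver_template_names(custom_resources):
    """Collect template file names referenced by AppSync resolver resources."""
    names = []
    for resource in custom_resources.get("Resources", {}).values():
        if resource.get("Type") != "AWS::AppSync::Resolver":
            continue
        props = resource.get("Properties", {})
        for key in ("RequestMappingTemplateS3Location", "ResponseMappingTemplateS3Location"):
            location = props.get(key)
            if not location:
                continue
            sub = location["Fn::Sub"]
            # Fn::Sub is either a plain string or a [template, variables] list.
            url = sub[0] if isinstance(sub, list) else sub
            names.append(url.rsplit("/", 1)[-1])
    return names


def missing_templates(custom_resources, resolvers_dir):
    """Return referenced template names that are not present in resolvers_dir."""
    return [
        name
        for name in resolver_template_names(custom_resources)
        if not os.path.isfile(os.path.join(resolvers_dir, name))
    ]


example = {
    "Resources": {
        "QueryNearbyBikeStations": {
            "Type": "AWS::AppSync::Resolver",
            "Properties": {
                "RequestMappingTemplateS3Location": {
                    "Fn::Sub": ["s3://bucket/key/resolvers/Query.nearbyBikeStations.req.vtl", {}]
                },
                "ResponseMappingTemplateS3Location": {
                    "Fn::Sub": ["s3://bucket/key/resolvers/Query.nearbyBikeStations.res.vtl", {}]
                },
            },
        }
    }
}
print(resolver_template_names(example))
```

In a real project you would load CustomResources.json with json.load and pass the path of your resolvers folder to missing_templates.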

Pushing changes to the cloud

Run the following command to push all changes to the cloud:

amplify push

Important: enabling searches in your application may incur costs, as shown in the table at the end of this article.

Once the command finishes, you can try your new custom GraphQL distance-aware search query using amplify console api from the command line and selecting GraphQL.

Conclusion

Congratulations! You have learnt how to create a custom GraphQL distance-aware search using AWS Amplify and AWS AppSync together with Amazon Elasticsearch.

In Part 3, we will cover how to create a client UI using Mapbox and Angular to integrate the distance-aware search we created.

Ready to code?

You don’t have an AWS Account? Use the next few minutes to create one and activate the free plan for a whole year. Follow steps at AWS knowledge center.

Once your free plan expires for Amazon Elasticsearch service, you are charged only for instance hours, Amazon EBS storage, and data transfer. See the pricing table overview:

Free tier for a new AWS Account. Check out latest pricing.
Picture with Jane Shih from last Open Up Summit in Taipei, Taiwan.

Thanks for reading!

Have you got any questions regarding this article, AWS Amplify or AWS AppSync? Feel free to ping me anytime at @gerardsans.

My Name is Gerard Sans. I am a Developer Advocate at AWS Mobile working with AWS Amplify and AWS AppSync teams.

GraphQL is an open-source data query and manipulation language for APIs.

Elasticsearch is an open source search and analytics engine based on the Lucene library.

Kibana is an open source data visualization dashboard for Elasticsearch.
