FridayAFM - Deep Learning Super Resolution

Written by Héctor Corte-León | Nov 23, 2023 7:12:44 PM

Héctor here, your AFM expert at Nanosurf calling out for people to share their Friday afternoon experiments. Today I'll show you how to upscale images using Python and the openCV library (which uses a deep learning-based method).

The process of estimating a high-resolution image from a low-resolution image is referred to as super-resolution (See Ref 1). It is a James Bond kind of trick where you say "enhance" and the quality of a zoomed image looks better. In other words, it consists in increasing the number of pixels of an image, and deciding which color should go into the new pixels based in the surrounding pixels.

However, since this pixel information is missing, deciding which pixel is "more correct", is difficult, that is why a rough classification of types of images is made in order to decide which method of upscaling is more suitable. The three types are photo, physical, and functional. The images where preserving or increasing the photo-realism is important, are for instance images of landscapes, of people, of trees... In general photo-realism is increased when the image produces in the observer the same result it will obtain if placed in the position where the image was taken. Images where a physical quantity or property is important are considered physical, and increasing the physical realism consists in what a perfect observation apparatus will detect, what infinite sensitivity or position accuracy, or zero noise will generate? Examples could be the stray field from magnetic domains, membrane wall on cells, or emission from single photon sources. Last, images describing actions, or procedures, or parts of a system, are considered functional, and the functional realism is increased when they better describe the things they are representing, examples of these could be furniture assembly manuals, schematic diagrams, or maps. For a different way of explaining it, see the figure below, or check the Ref 2.

So, how this relates to AFM? In general, in AFM you want to increase the physical realism, because that will help you better understand the sample under study. However, physical realism usual requires physical models fit for every specific case. On the other hand, photo-realism has been explored quite a lot and there are easy to use free scripts, so... if we try photo-realistic super-resolution in AFM images, how it will behave?

First let me show you how it works on pictures. For that I got a blast from the past, a Sony Mavica, a digital camera from the early 2000s that stores data on floppy disks (how cool is that by the way?). My test image is of course, the DRIVEAFM.

How an image taken with this camera will look like after super-resolution is applied to it? This what the original and 8-times scaled up looks like. What LapSRN means, will be explained later.

Is it any better? Can't really tell. So here is a zoom-in of a detail and how it looks like after applying different super-resolution "methods".

Now, while the original is legible, it is clear that the letter is not well defined. Do the other methods do any better? EDSR seems to accentuate the ripples on the top right of the "e" letter, so while improving, it has created something that is not really there. FSRCNN is not so bad with the ripples, but the edges of the letter "e", and specially the "V" on the left look pixelated. This again is not in the real object. LapSRN x4 and LapSRN x8 seem to produce the best results, the ripples are not magnified, there is no pixelation, and the edges of the letter "e" look sharp (in this respect, LapSRN x8 looks better than LapSRNx4).

Question now, is what EDSR, FSRCN and LapSRN mean and how to repeat this process with AFM data?

These names correspond to deep learning convolutional neural networks trained to perform super-resolution (here the wiki for deep learning and here for convolutional neural networks and the link to the paper in Ref 3). Deep learning means many layers of neurons between input and output, and convolutional means that there is at leas one convolution layer (a set of neurons is presented with subsets of data, in other words, it is convoluted across the image, in general to identify features). The convolution part is what allows applying the same network to images with different number of pixels. Remember the post about neural Networks and Gwyddion? That was a convolutional neural network, but it was not considered deep learning because it only had one hidden layer.

OK OK, all these definitions about realism and neural networks are good, but I'm no expert and I just need something to make my images look better, I took a low resolution image, it is the only one I got and it looks horrible in the pre-print of the journal, what can I do? Don't panic, lucky for us, these networks have been trained already by experts, they are available only to download, and they have been implemented in the OpenCV Python library, so it only takes a few lines of code to load an image, pass it through the networks and obtain something with higher number of pixels. In fact, I just did that for you, here is the few lines of Python code to select an image (*.png) and generate its supper resolution version.

By the way, I didn't invent anything new here, I pretty much followed the article in here.

import cv2
from cv2 import dnn_superres
from pathlib import Path
import tkinter as tk
from tkinter import filedialog
# Script to perform super resolution on png images.
# The methods (i.e. the neural networks), should be saved in the same folder are this script.
# Define the available methods alongside with the corresponding scaling factor.
methods=dict(EDSR_x4=("EDSR_x4.pb",4,"edsr"),FSRCNN_x4=("FSRCNN_x4.pb",4,"fsrcnn"),LapSRN_x4=("LapSRN_x4.pb",4,"lapsrn"),LapSRN_x8=("LapSRN_x8.pb",8,"lapsrn"))

# Define the upscaling function in OpenCV (i.e. loads the model, runs the model and saves the image.
def upscale(path,model,scale,image,file_path):
    # Read the desired model
    sr.readModel(path)
    # Set the desired model and scale to get correct pre- and post-processing    
    sr.setModel(model, scale)
    # Upscale the image
    result = sr.upsample(image)
    # Save the image
    cv2.imwrite(file_path+"-"+path+"-"+".png", result)
   
# Open an user interface to look for the image or images to be upscaled.
root = tk.Tk()
root.withdraw()
files_path = filedialog.askopenfilenames()
 
#Iterates through all the images selected.
for m in range(len(files_path)):
    name=Path(files_path[m]).stem
    # Read image
    image = cv2.imread(files_path[m])
    # Create an SuperResolution object
    sr = dnn_superres.DnnSuperResImpl_create()
    # This is a reminder in case you wonder how to operate with dictionaries
    # x=[(k, v) for k, v in methods.items()] This is how to extract the keys and values of a dictionary.
    # Iterates through all the networks in methods (for a 2048 by 2048 px takes about two minutes to run)
    # Prints the image name and then applies the upscale (which also saves the output once up scaled)
    print('Image ' + name)
    for k in methods:
        print('Currently doing ' + methods [k][0])
        upscale(methods[k][0],methods[k][2],methods[k][1],image,files_path[m])
   
#Indicates that the script has finished.
print('++++++++++++++++ All done ++++++++++++++++')

You will need to download the networks from these links and place them on the same folder as the script above.

EDSR [3]. This is the best performing model. However, it is also the biggest model and therefor has the biggest file size and slowest inference. You can download it here.
ESPCN [4]. This is a small model with fast and good inference. It can do real-time video upscaling (depending on image size). You can download it here.
FSRCNN [5]. This is also small model with fast and accurate inference. Can also do real-time video upscaling. You can download it here.
LapSRN [6]. This is a medium sized model that can upscale by a factor as high as 8. You can download it here.

With the theory explained (in general terms at least), and the tools ready, it is time to put it to a test. Fortunately, been doing fridayAFM for this long, I have plenty of images to do the tests. The first one comes from a post you might remember, the surface of a rose petal. For those who missed the white rose petal post I reproduce it here.

At the time I didn't noticed it, but the software I used to stich images together, has a maximum resolution for exporting, so the stitch image has lower quality than the images used to create it.

So, maybe with the super resolution script we can regain some of the lost quality? Here are the results:

A part from LapSRN, which I don't think excel in any particular feature, almost all the networks improve one feature or another, some are better on the edges, some on the kinks, and some on trenches. However, overall (and this is a personal preference at this point), FSRCNN seems to bring the results closer to the original.

At this point it is worth mentioning a few considerations. Gold color scale is considered the one that better highlights features for the human eye, that's why I'm using it here. To make the comparison possible, I limit the color scale range to the range of the smallest field of view, otherwise we will be losing contrast which will make it more difficult for the super resolution to match the target. Pixel size and not the number of pixels is what matters in AFM imaging, so when comparing zoom-in with zoom-out images, they were taken in the best scanning conditions for the range and for the pixel size, and it is the different pixel size what is really being compared.

For the next test, one of my favourite topics. MFM on a hard disk drive (have you chek this post about the smallest hard disk drive?). In this case I have zoom-out and zoom-in images to compare, same number of pixels, hence different pixel size.

In this case I struggle to see a big difference between the results, however, there are some fine details that might indeed be different.

Next test is an EPROM IC from 1984. Remember the post about reading 34-years old data? This was a sample that easily blunted tips because of the sharp edges and trenches.

Again the results are very tight, but small differences can be noticed.

Last but not least, the yeast. Remember when I prepared some yeast and imaged it dried? Or when I combined yeast and egg shell membrane?

This is the trickiest one, because in order to highlight the features on top of the yeast surface, these images where heavily "flattened" (removing large scale topography and leaving short scale features intact). This means that using the same color scale limits in both will not produce the same contrast. However, I tried my best to keep them similar in terms of contrast. These are the super resolution results.

Now, whilst in some of the previous cases it was questionable if there was an improvement or not... here the improvement is massive and all the networks perform well. However, there is a winner, EDSR manages to get closer to the x2 AFM zoom.

Let's recap. There has been a network that performed well with all the test samples (maybe not winning in all the cases, but definitely not introducing artefacts), that's the EDSR x4. Does it mean we should go an apply this network to obtain super resolution on all the images we obtain? Answer is not. Your best option will always be trying to obtain more data with the AFM, however, it is a nice tool to have in your inventory, and in some specific cases (like for instance where you know the underlaying structure by other means), it can be used to improve the obtained results or to try acquire new knowledge about your sample.

I hope you find it useful, entertaining, and try it yourselves. Please let me know if you use some of this, and as usual, if you have suggestions or requests, don't hesitate to contact me.