Quality Prediction in Jet Printing Using Neural Networks

DEGREE PROJECT IN MECHANICAL ENGINEERING, SECOND CYCLE, 30 CREDITS
STOCKHOLM, SWEDEN 2020

Daniel Brun
Colin Lawless

KTH ROYAL INSTITUTE OF TECHNOLOGY
SCHOOL OF INDUSTRIAL ENGINEERING AND MANAGEMENT

Authors
Daniel Brun
Colin Lawless
KTH Royal Institute of Technology

Place for Project
Täby, Sweden
Mycronic AB

Examiner
Hans Johansson
Stockholm, Sweden
KTH Royal Institute of Technology

Supervisor at KTH
Carl During
Stockholm, Sweden
KTH Royal Institute of Technology

Supervisor at Mycronic
Gustaf Mårtensson
Täby, Sweden
Mycronic AB

Master’s Thesis Coordinator
Fredrik Asplund
Stockholm, Sweden
KTH Royal Institute of Technology


Master of Science Thesis TRITA-ITM-EX 2020:229

Quality Prediction in Jet Printing Using Neural Networks

Daniel Brun
Colin Lawless

Approved: 2020-06-01
Examiner: Hans Johansson
Supervisor: Carl During
Commissioner: Mycronic
Contact person: Gustaf Mårtensson

Abstract

Surface mount technology is widely used in the manufacturing of commercial

electronics, and the demands on the machines increase as the complexity of the

electronics increases and the size of the components decreases. Mycronic is a company

that focuses on addressing those demands with their high-technology jet printing and

pick-and-place machines. This master's thesis has been performed at Mycronic and has

focused on the MY700 jet printer. Due to unknown factors, the quality of the ejected

solder paste droplets from the machine can vary over time. It was therefore of interest

to monitor variables of the MY700 in order to gain more knowledge about the cause of

the varying quality, and also to be able to detect substantial changes in deposit quality.

In this project, the temperature has been measured at three key locations on the

ejector as well as the current going through the piezoelectric actuator. This data was

fed to a neural network in order to make quality predictions with respect to the

diameter of the solder paste deposits. Different combinations of sensor data were used

to evaluate how the different sensors affected the performance of the neural network.

Thereby, a better understanding of how big an impact the different variables had on

the quality of the deposits could be achieved.

The results indicate that the current was more significant than the temperature for

making quality predictions. Using only the temperature data, the neural network was

not able to accurately predict quality deviations, whereas with the piezo current data

or both of them combined, better predictions could be made. The current data also

significantly improved the performance of the neural network when printing jobs with

varying diameters were used. The conclusion is that none of the three temperature

sensors significantly improved the performance, and there were no considerable

differences between them, while the current did improve it.


Degree Project TRITA-ITM-EX 2020:229

Quality Estimation of Jet-Dispensed Solder Paste with a Neural Network

Daniel Brun
Colin Lawless

Approved: 2020-06-01
Examiner: Hans Johansson
Supervisor: Carl During
Commissioner: Mycronic
Contact person: Gustaf Mårtensson

Summary

Surface mount technology is a well-established method used in the manufacturing of commercial electronics, and the demands on these machines increase as the complexity of the electronics increases and the size of the components decreases. Mycronic is a company whose focus is on meeting these demands with its high-technology jet printing and pick-and-place machines. This degree project has been carried out at Mycronic and has focused on the MY700 jet printing machine. Due to unknown factors, the quality of the solder paste deposited by the machine can vary over time. It was therefore of interest to monitor variables of the machine in order to gain more knowledge about the cause of the varying quality, and also to be able to detect changes in quality.

In this project, the temperature has been measured at three critical positions on the ejector, as well as the current passing through the piezoelectric actuator. These data were given to a neural network in order to make quality predictions with respect to the diameter of the solder paste deposits. Different combinations of sensor data were used to evaluate how the different sensors affected the performance of the neural network. In this way, a better understanding of how large an impact the different variables had on the quality of the deposits could be achieved.

The results indicate that the current was more significant than the temperature for making quality predictions. When only temperature data were used, the neural network was not able to make accurate predictions of quality deviations, whereas with only current data, or with both combined, better predictions could be made. The current data also improved the performance of the neural network when jobs with different diameters were used. The conclusion is that none of the three temperature sensors improved the performance significantly, and there were no considerable differences between them, whereas the current did improve the performance.


Acknowledgements

Firstly, we want to thank Gustaf Mårtensson, Daniel Grafström and Juan Albahaca for

realizing this project and giving us the opportunity to complete it. You have also given

us your support and encouragement throughout the project, for which we are grateful.

We also want to express our deepest appreciation to all our coworkers for providing

expert knowledge and for supporting us when needed.

A special thanks to our supervisor, Carl During, for his support and feedback

throughout the project.

Tack!

Thank you!


Contents

1 Introduction
    1.1 Background
    1.2 Problem Formulation
    1.3 Research Question
    1.4 Requirements
    1.5 Delimitations
    1.6 Methodology
    1.7 Thesis Outline
2 Frame-of-Reference
    2.1 Jetting Technology
        2.1.1 Piezoelectric Actuator
        2.1.2 Jet Printing Quality
    2.2 Non-Newtonian Fluids
        2.2.1 Solder Paste
    2.3 Sensors
        2.3.1 Temperature
        2.3.2 Current
    2.4 Data Acquisition
    2.5 Neural Network Architecture
        2.5.1 Recurrent Neural Network
3 Methodology
    3.1 Research Strategy
    3.2 Internal and External Validity
    3.3 Procedure
4 Implementation
    4.1 Hardware Configuration
        4.1.1 Sensors
        4.1.2 Red Pitaya
    4.2 Software Configuration
    4.3 Experimental Procedure
    4.4 Verification and Validation
        4.4.1 Thermocouples
        4.4.2 Shunt resistor
        4.4.3 Prediction Model
5 Results
    5.1 BGA Results
    5.2 RT1 Results
    5.3 Fulfillment of Requirements
6 Discussion
    6.1 Test Cases
        6.1.1 BGA
        6.1.2 RT1
        6.1.3 Neural Network Performance
    6.2 Sensors
    6.3 Requirements
    6.4 Research Method
7 Conclusions
8 Future Work
References
Appendices
    A PCB Schematic Overview
    B Training and Validation Loss
    C Performance Without Current


List of Figures

1.1.1 Ejector of a jet printing machine.
1.1.2 An assembly line solution at Mycronic.
1.2.1 Temperature sensor placement.
1.2.2 Current and voltage waveform.
2.1.1 Simplified illustration of the ejector in a jet printing machine.
2.1.2 Three-phase voltage waveform of the piezo actuator for a single ejected solder paste droplet.
2.1.3 Two different qualities of a BGA jet printing job.
2.1.4 Quality measurement of a single droplet on a substrate.
2.2.1 Rheological properties of different non-Newtonian fluids.
2.2.2 Shear viscosity as a function of shear rate for two solder paste samples.
2.3.1 Simple thermocouple circuit.
2.4.1 Red Pitaya overview.
2.5.1 Typical architecture of a fully connected neural network with one hidden layer.
2.5.2 Structure of an artificial neuron in a neural network.
2.5.3 Circuit diagram of a cell in an RNN.
2.5.4 Illustration of the data flow through an LSTM cell.
2.5.5 LSTM models.
4.1.1 Overview of hardware setup.
4.1.2 Flow chart illustrating the data transfer.
4.1.3 The PCB used for data gathering.
4.4.1 Measured temperature before and after calibration.
4.4.2 Thermocouples response to temperature changes.
4.4.3 Measured temperature before and after filtering.
4.4.4 Positioning of the shunt resistor.
4.4.5 Measured current of a single solder paste shot in a BGA job.
4.4.6 Measured noise from the current sensing.
4.4.7 Calibrated measurement of current.
4.4.8 Verification and validation of the LSTM model.
5.1.1 Training and validation loss when training the LSTM model using a BGA job.
5.1.2 Predicted and true diameter using a BGA job with current data as input.
5.1.3 Predicted and true diameter using a BGA job with three different variable configurations as input.
5.1.4 Predicted and true diameter using a BGA job with all input variables given to the model.
5.1.5 Predicted and true distribution using a BGA job.
5.2.1 Training and validation loss when training the LSTM model using an RT1 job.
5.2.2 Predicted and true diameter using an RT1 job with all input variables given to the model.
5.2.3 Predicted and true distribution using an RT1 job with all input parameters to the LSTM model.
A.0.1 PCB Schematic Overview.
B.0.1 Training and validation loss for different input parameter configurations using a BGA job.
C.0.1 Results from the predictions by the LSTM model when given only the temperature as input when using a BGA job.
C.0.2 Results from the predictions by the LSTM model when given only the temperature as input when using an RT1 job.


List of Tables

4.2.1 Architecture of prediction model.
4.3.1 Variable configuration for the BGA and RT1 job.
4.3.2 Evaluation cases for the LSTM model, where a BGA job with different input configuration is fed to the model.
5.1.1 BGA test results for different sensor combinations.
5.2.1 RT1 test results for selected sensor combination.
C.0.1 Test results when only using temperature data.


List of Abbreviations

A/D Analog-to-Digital

ASIC Application Specific Integrated Circuit

BGA Ball Grid Array

BNC Bayonet Neill–Concelman

D/A Digital-to-Analog

EMI Electromagnetic Interference

FPGA Field-Programmable Gate Array

GPIO General-Purpose Input/Output

GPU Graphics Processing Unit

I2C Inter-Integrated Circuit

IC Integrated Circuit

IDC Insulation-Displacement Contact

LSTM Long Short-Term Memory

MAE Mean Absolute Error

MRE Mean Relative Error

PCB Printed Circuit Board

PnP Pick-and-Place

PRT Platinum Resistance Thermometer

RNN Recurrent Neural Network

RT1 Robustness Test 1


SGD Stochastic Gradient Descent

SMD Surface Mount Device

SMT Surface Mount Technology

SPC Statistical Process Control

SPI Serial Peripheral Interface

UART Universal Asynchronous Receiver/Transmitter


Chapter 1

Introduction

This chapter introduces the project, its purpose and its framework. Moreover, it

introduces the company where the master’s thesis project has been conducted, namely

Mycronic, as well as provides a brief overview of their technology.

1.1 Background

Surface mount technology (SMT) is a technology used for producing printed circuit

boards (PCBs), which started to become widely used in the 1980s. Instead of having

components with leads that go through the PCB, so-called through-hole components,

surface mount devices (SMDs) are used. SMT is used in virtually all commercial

production of circuit boards due to advantages with regard to size, cost, reliability and

automatability.

There are different ways of applying solder paste for SMD components, one of

which is jet printing. This method utilizes a piezoelectric actuator to operate a piston

which ejects solder paste out of a nozzle. The solder paste dots are ejected at high

frequencies and typically have a volume of 5-20 nl. Figure 1.1.1 shows an example of a

printing head, also called an ejector.


Figure 1.1.1: Ejector of a jet printing machine [1].

One company which uses this type of technology is the Swedish manufacturer

Mycronic. Mycronic is a high-tech company that has been producing world-leading

production equipment for display and electronic manufacturing since the early 80s [2].

In addition to Mycronic’s mask writer, they also offer complete assembly line solutions,

which include the jet printing machine and the Pick-and-Place (PnP) machines [3].

This project was focused on their technology within SMT and, more specifically, the

jet printing machine which is the second unit in the assembly line as seen in Figure

1.1.2.

Figure 1.1.2: An assembly line solution at Mycronic [4].



1.2 Problem Formulation

A jet printing machine deposits solder paste or other assembly fluids with high accuracy

and good repeatability, but due to unknown factors the quality of the ejected solder

paste droplets can still vary between shots. Thus, it is of interest to monitor variables

of the jet printing machine using a neural network in order to gain more knowledge

about the cause of the varying quality.

Manufacturers of circuit boards face growing demands for higher density and

complexity in today’s technology. The traditional technique of

detecting defects in PCB production is through statistical process control (SPC), but

it is used off-line which means that defects are detected after the completed process

[5]. In [5], [6] and [7] it is mentioned that real-time detection of process drifts is

preferred in order to make production more efficient and robust. In [5], it is stated

that a neural network has advantages in both accuracy and robustness in the field of

modelling semiconductor processes. It is further explained in [5] that a neural network

can learn to map complex sequences and handle corrupted data.

In order to implement a neural network for predicting the quality of solder paste

deposits, it first has to be decided what variables should be monitored. The choice of

variable depends on the availability of the signal and on how well it is expected to predict a

certain behaviour. One possibility is temperature. When the temperature of solder

paste varies, so do its rheological properties [8], [9]. Different components and

mechanisms in the ejector are affected depending on how much the temperature

changes during a printing job. A previous study mentions that an increase in

temperature will decrease the viscosity, which will affect the quality of the printing

negatively [9]. Therefore, a possible hypothesis is that data gathered from temperature

sensors in the ejector can be used as an input to a neural network that could be used to

predict changes in jetting quality. The place where the temperature is measured may

have an influence on the performance of the neural network, and therefore three sensor

positions will be compared in this project. These three temperature sensor positions

are based on sensor positions used in a previous master’s thesis at Mycronic [4].

A stepper motor controls the Archimedes screw which feeds the ejector with solder

paste, as seen in Figure 1.1.1. The efficiency of the stepper motor is highly dependent

on the solder paste properties, such as the viscosity of the fluid. Before each shot,

the screw is turned a certain number of degrees and if the temperature changes, the

volume being fed will also change to some extent. The temperature here is affected

by heat generated in the ejector, friction in the paste screw, as well as heat generated

by the stepper motor. Therefore, one temperature sensor was placed after the paste

screw, which is shown as Sensor 1 in Figure 1.2.1. The solder paste being fed to the

screw comes from a tube which has been stored in a refrigerator, and once mounted in



the machine is slowly heated up to room temperature before jetting starts. Measuring

the temperature of the paste being fed was therefore also of interest, since this could

also affect the pump. This position is named Sensor 2 in Figure 1.2.1.

As for the piezo-controlled piston, the temperature can have multiple effects. The

rheological properties are a function of temperature. Changes of those properties

will affect how easy or hard it is for the piston to push the paste through the ejector

nozzle, but also the number of undesirable air pockets that are created in the chamber.

These air pockets change how much solder paste is ejected and at what speed, thereby

decreasing the quality. A temperature sensor was therefore placed in the chamber next

to the piston, presented as Sensor 3 in Figure 1.2.1. As changes in temperature have

an impact on the jetted quality due to changes in viscosity [9], having a sensor in this

location was also of interest. The properties of the paste here affect how the paste

exits the nozzle, that is, if there is a tendency for satellites, if the positioning is good,

etc. Satellites are small undesirable solder paste droplets that break off from the main

deposits. This is illustrated in more detail in Section 2.1.2.

Figure 1.2.1: Temperature sensor placement [4].

The displacement of the piezoelectric actuator used in the ejector is controlled

by a voltage reference which follows a predetermined curve. Collected data from

the measured voltage curve, such as rise time and amplitude, could be used to

train the neural network to predict the quality of the jetted solder paste. However,

Mycronic considers the supply voltage to the actuator to be too noisy for qualitative

measurements [4]. An alternative way of measuring the displacement of the

piezoelectric actuator is to measure the current required to follow the voltage as seen in

Figure 1.2.2, which was performed in [4]. It is Mycronic’s hypothesis that the measured

current variations express individual droplet characteristics [4]. The data gathering

from the ejector can also be performed in a non-invasive way.



Figure 1.2.2: Theoretical graph of the current and voltage waveform sequence for an individual jetted solder paste droplet [4].
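As a rough illustration of the waveform features mentioned above, the sketch below extracts the 10–90% rise time and the amplitude from a sampled pulse. The synthetic trapezoidal pulse, the sampling grid, the 10% and 90% thresholds and the function names `first_crossing` and `rise_time_and_amplitude` are assumptions made for this example; they are not taken from this thesis or from Mycronic's software.

```python
import numpy as np


def first_crossing(t, v, level):
    """Time at which v first reaches `level`, using linear interpolation."""
    idx = int(np.argmax(v >= level))  # index of first sample at or above level
    if idx == 0:
        return float(t[0])
    frac = (level - v[idx - 1]) / (v[idx] - v[idx - 1])
    return float(t[idx - 1] + frac * (t[idx] - t[idx - 1]))


def rise_time_and_amplitude(t, v, low=0.1, high=0.9):
    """Return (rise time, amplitude) of the first rising edge of a pulse."""
    t = np.asarray(t, dtype=float)
    v = np.asarray(v, dtype=float)
    baseline = v[0]                    # assumes the record starts at rest
    amplitude = v.max() - baseline
    t_low = first_crossing(t, v, baseline + low * amplitude)
    t_high = first_crossing(t, v, baseline + high * amplitude)
    return t_high - t_low, amplitude


# Hypothetical pulse: 20 us rise to 100 V, 40 us hold, 20 us fall.
t = np.linspace(0.0, 100e-6, 1001)
v = np.interp(t, [0.0, 20e-6, 60e-6, 80e-6, 100e-6],
              [0.0, 100.0, 100.0, 0.0, 0.0])
rise_time, amplitude = rise_time_and_amplitude(t, v)
print(f"rise time = {rise_time * 1e6:.1f} us, amplitude = {amplitude:.1f} V")
```

On measured data the same idea would likely need filtering first, since a noisy signal can cross a threshold earlier than the true edge does.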

1.3 Research Question

The research question for this master’s thesis is as follows:

In a piezo-based material depositing device, what are the implications

of the predetermined temperature sensor positions, when providing

supporting data from a current sensor, in regard to increasing the

accuracy of predicting jetted solder paste quality by training a neural

network?

1.4 Requirements

The requirements for this project were decided together with the stakeholders and are

listed below.

• A neural network shall be trained to predict changes in the quality of jetting

deposits which later can be used for real-time prediction.

• Temperature and current shall be measured and the data shall be used as input

to the neural network.

• Three different locations on the ejector shall be used for temperature

measurements.



• The quality of the solder paste shall be based on the diameter of the shots and

these shall be measured using a MY700 jet printer.

• Acceptable results from the neural network require the predicted diameter to

vary less than 8% from the actual diameter for individual predictions.
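The 8% acceptance criterion above applies to individual predictions, so it can be checked element by element, as in the minimal sketch below. The function name `diameter_errors`, the dictionary keys and the diameter values are hypothetical and chosen only for illustration; the mean absolute error and mean relative error are computed in their usual sense.

```python
import numpy as np


def diameter_errors(d_true, d_pred, tolerance=0.08):
    """Summarise prediction errors and test the per-prediction tolerance."""
    d_true = np.asarray(d_true, dtype=float)
    d_pred = np.asarray(d_pred, dtype=float)
    abs_err = np.abs(d_pred - d_true)
    rel_err = abs_err / d_true
    return {
        "mae": float(abs_err.mean()),          # mean absolute error
        "mre": float(rel_err.mean()),          # mean relative error
        "max_rel_err": float(rel_err.max()),
        # True only if every single prediction is inside the tolerance
        "within_tolerance": bool(np.all(rel_err < tolerance)),
    }


# Hypothetical diameters in micrometres, not measured data.
d_true = [330.0, 340.0, 350.0, 360.0]
d_pred = [335.0, 338.0, 370.0, 352.0]
report = diameter_errors(d_true, d_pred)
print(report)
```

Reporting the maximum relative error next to the mean is a deliberate choice here: a low mean error can hide a single prediction that violates the per-prediction limit.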

1.5 Delimitations

The delimitations of the project are listed below.

• The MY700 shall be used to gather data for the neural network.

• The focus of the project shall be to only evaluate correlations between two

predetermined input parameters and one output parameter.

• The two inputs shall be temperature and current, while the output shall be the

quality of jetting deposits with respect to diameter.

• The project shall only include two different types of jetting jobs on the MY700,

which are a Ball Grid Array (BGA) job and a Robustness Test 1 (RT1) job.

• The study shall be performed on one machine and one ejector only.

• The neural network shall be trained off-line so that, at a later stage, it can be used

for real-time prediction.

1.6 Methodology

The methodology used in this project was a case study. Two different jet printing jobs

were performed with different conditions for the neural network to be trained on. The

argument for using a case study in this project, and a more detailed explanation of the

jet jobs can be found in Chapter 3.

The workload has been shared equally between us in this project, and both of us have

been working on the software and the hardware. The main purposes of contributing

equally in all areas were that both of us should gain knowledge in all fields and that

we could easily share thoughts and ideas during the development of the project. We

have followed the Scrum team management framework in order to assign weekly tasks

and easily follow up on them.



1.7 Thesis Outline

Chapter 1 provides an introduction to the project and its purpose, as well as an

overview of the company where the master’s thesis has been conducted. Furthermore,

it describes the importance of this project in order to create a framework to further

understand the changes in the quality of jetting deposits. Chapter 2 summarizes the

literature study about the ejector technology, the properties of solder paste, the sensors

used and different types of architectures of neural networks. Chapter 3 explains the

chosen methodology for this project, as well as alternatives. Furthermore, it discusses

the internal and external validity and the procedure of this project. Chapter 4 explains

the procedure of implementing the sensors and extracting the data to the neural

network, as well as the design of it. This chapter also explains the verification and

validation process of each component of the project. The results of how well the neural

network could predict the quality of the ejected droplets are presented in Chapter

5. Chapters 6 and 7 discuss the findings and link back to the research question and

requirements. Improvements and future work are presented in Chapter 8.


Chapter 2

Frame-of-Reference

This chapter summarizes the literature study about the ejector technology, the

properties of solder paste, the sensors used, the data acquisition and different types

of architectures of neural networks.

2.1 Jetting Technology

A simplified illustration of an ejector in a jet printing machine is shown in Figure 2.1.1,

where some key components are highlighted. One key component that is missing from

Figure 2.1.1, but shown in Figure 1.1.1, is the Archimedes screw, which feeds

the solder paste into the chamber from the container.

Figure 2.1.1: Simplified illustration of the ejector in a jet printing machine [4].

In the ejector, which is used in the Mycronic MY700, an Archimedes screw feeds

the chamber with solder paste in a controlled way from the container with solder


paste. The piezo expands as voltage is applied to it, causing the piston to move. The

momentum from the piston is transferred into the solder paste in the chamber and the

material is ejected out from the nozzle. The volume of the solder paste in the chamber

is accurately controlled by the Archimedes screw. As the voltage level drops, the piezo

volume is reduced and the spring moves the piston to its initial position [4]. As the

piston moves back up, the process is repeated. The jet printers can eject solder paste

droplets at a frequency of up to 300 Hz, and the volume of the droplets is measured in

nanoliters [10], [11].

2.1.1 Piezoelectric Actuator

In a piezoelectric actuator there are piezoelectric crystals, forming a ceramic, which

expand as voltage is applied and vice versa. Thus, electrical energy is converted to

mechanical displacement. If an alternating voltage is applied to the material, it changes

its dimensions cyclically at the frequency of the applied voltage. The frequency at which

the piezo most efficiently converts the electrical energy to mechanical displacement

is its resonant frequency, which is where the impedance is the lowest [12]. The

resonant frequency is determined by the composition of the piezoelectric crystals, as

well as the shape and volume. The main advantages of a piezoelectric actuator are that

it has high precision [12], [13], high force, fast response time [14] and fast acceleration

[15]. A drawback is that they can be affected by hysteresis, that is, that the history of

the electric field, stress and displacement can cause nonlinearity [16].

There are two different types of piezoelectric actuators: stack and stripe [12]. The

stack piezo uses multiple stacked layers of piezo elements, and the displacements of the

individual elements add up to the total displacement [15], as shown

in Equation 2.1. Furthermore, the displacement of the stacked piezo is about 0.1–

0.15% of its total length. However, if the path of displacement is blocked, a force is

applied to the blocking object. The movement of a stacked piezo actuator is defined by

$$\Delta L = n \cdot d_{33} \cdot V, \tag{2.1}$$

where $n$ is the number of stacked piezo elements, $d_{33}$ is the piezoelectric coefficient

and $V$ is the voltage applied. The stacked piezo actuator can be divided into two

different categories, which are either high or low voltage. The low voltage is rated for

an operating voltage up to 200 V and the high voltage is rated for an operating voltage

up to 1000 V. A stacked piezo actuator is categorized as either high or low voltage

depending on the thickness of the piezo element. The thicker the piezo element is,

the higher voltage it can operate at [15]. The stacked type of piezo actuator is used in

the ejectors in Mycronic’s jet printing machines.
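To make Equation 2.1 concrete, the sketch below evaluates it for one set of illustrative values. The layer count, the piezoelectric coefficient, the drive voltage and the stack length are assumptions chosen to give round numbers; they are not the parameters of the actuator in the MY700.

```python
def stack_displacement(n_layers, d33, voltage):
    """Free displacement of a stacked piezo actuator, Equation 2.1 (metres)."""
    return n_layers * d33 * voltage


# Illustrative values only (assumed, not taken from the MY700).
n_layers = 200        # number of stacked piezo elements
d33 = 500e-12         # piezoelectric coefficient in m/V
voltage = 100.0       # applied voltage in V
stack_length = 10e-3  # assumed total stack length in m

delta_l = stack_displacement(n_layers, d33, voltage)
relative_stroke = delta_l / stack_length

print(f"displacement = {delta_l * 1e6:.1f} um")
print(f"relative stroke = {relative_stroke * 100:.2f} % of the stack length")
```

With these assumed values the stroke is about 10 µm, or roughly 0.1% of the stack length, which is in line with the 0.1–0.15% range quoted above.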



A striped piezo actuator is configured with two stripes of piezo elements in an

orientation such that when voltage is applied, one of them contracts and the other one

expands [17]. This causes the striped piezo actuator to flex. However, this type is not

used in the ejectors in Mycronic’s jet printing machines.

In Mycronic’s jet printing machines, the stacked piezo actuator is controlled by a

multi-phase-waveform voltage level. An example of a simplified three-phase voltage-

time waveform is shown in Figure 2.1.2.

Figure 2.1.2: Three-phase voltage waveform of the piezo actuator for a single ejected solder paste droplet.
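A waveform of this kind can be sketched as a piecewise-linear pulse. The example below assumes that the three phases are a rise, a hold and a fall, and it approximates the actuator as an ideal capacitor so that the drive current is i = C·dV/dt. The phase durations, the peak voltage and the capacitance are hypothetical values, and the capacitor model is a first-order simplification introduced here, not a description of Mycronic's drive electronics.

```python
import numpy as np


def three_phase_waveform(t, t_rise, t_hold, t_fall, v_peak):
    """Piecewise-linear rise/hold/fall voltage pulse that starts at t = 0."""
    knots_t = [0.0, t_rise, t_rise + t_hold, t_rise + t_hold + t_fall]
    knots_v = [0.0, v_peak, v_peak, 0.0]
    return np.interp(t, knots_t, knots_v)  # stays at 0 V after the pulse


# Hypothetical pulse parameters and actuator capacitance.
t = np.linspace(0.0, 100e-6, 1001)            # 0.1 us sampling step
v = three_phase_waveform(t, 20e-6, 40e-6, 20e-6, 100.0)
capacitance = 1e-6                            # farads (assumed)
current = capacitance * np.gradient(v, t)     # ideal-capacitor model

print(f"current during rise: {current[100]:.2f} A")
print(f"current during hold: {current[400]:.2f} A")
print(f"current during fall: {current[700]:.2f} A")
```

Under this simplified model the current is non-zero only while the voltage is changing, which is one way to see why the current signal carries information about how the actuator follows its voltage reference.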

2.1.2 Jet Printing Quality

At Mycronic, there are different ways of analyzing the jet printing jobs of the MY700

machine. One option of test jetting that is frequently used is to perform a BGA test,

which entails producing deposits for generic BGA components. This test deposits a

pattern of squares where the dots can have different sizes and distances between them

[4], [18]. Figure 2.1.3a illustrates an approved jet printing job, while Figure 2.1.3b

shows a faulty job. A faulty job can be confirmed if the deposits of the job have an erratic

positioning or size, contain bridges of solder paste between the deposits or contain

satellites. Another type of test is called RT1. This test shoots 12-dot strips with varying

diameters and frequencies. The finished test boards are analyzed by the machine

from which quality measurements such as diameter, satellites, area, positioning and

shape can be extracted. Another solder paste inspection machine can extract additional

quality measurements for each deposit, such as volume. As seen in Figure 2.1.3a, an

approved job has few satellites, a consistent pattern and accurate shapes. In Figure 2.1.4,

an image of an individual ejected droplet is shown along with the presence of a satellite.



Figure 2.1.3: Two different qualities of a BGA jet printing job [4]. (a) Accepted job. (b) Faulty job.

Figure 2.1.4: Quality measurement of a single droplet on a substrate. Positioning error, area, shape and satellites are illustrated [4].

2.2 Non-Newtonian Fluids

Fluids can be divided into two categories: Newtonian and non-Newtonian. A

Newtonian fluid follows Newton’s law of viscosity, that is, that the viscosity of the

fluid is independent of the shear rate [19]. Generally, the viscosity of a Newtonian

fluid is constant at a given temperature and pressure, and examples of such are air

and water. Not all fluids follow Newton’s law of viscosity, and these fluids are referred

to as non-Newtonian. These fluids display a more complex behaviour as they do not

have a constant viscosity at a given temperature and pressure. Instead, the viscosity

is dependent on the flow conditions, such as shear rate, flow geometry and even

kinematic history in certain cases [20].

The study of deformation and flow of material is called rheology. There are different

types of non-Newtonian fluids which have different rheological properties, and they

can be divided into four categories: pseudoplastic, dilatant, thixotropic and rheopectic

fluids [21]. The behaviour of these can be seen in Figure 2.2.1. In a Newtonian fluid

the viscosity, defined as shear stress divided by shear rate, is constant and is therefore

represented by a linear relationship in the graph.

Figure 2.2.1: Rheological properties of different non-Newtonian fluids [20].

Pseudoplastic fluids are shear thinning, meaning that as the stress increases, the

viscosity decreases. An example of a shear thinning fluid is ketchup. A similar variant

is yield-pseudoplastics which behave like pseudoplastics, but only after a certain yield

stress. Dilatant fluids are shear thickening and behave the opposite way compared with

pseudoplastics, meaning that the viscosity increases as the stress increases. Cornstarch

mixed with water, also known as oobleck, is an example of this. Both thixotropic and

rheopectic fluids are time-dependent. The viscosity of thixotropic fluids decreases with

stress over time and with rheopectic fluids it increases [21]. Examples of thixotropic

and rheopectic fluids are solder paste and printer ink, respectively.
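
The difference between shear thinning, Newtonian and shear thickening behaviour can be illustrated with a minimal sketch. It assumes the power-law (Ostwald–de Waele) model, in which the shear stress equals K multiplied by the shear rate raised to the power n. This model is an assumption introduced here for illustration and is not a model used in this thesis, and the values of the consistency index `consistency_k` and the flow index `flow_index_n` are made up.

```python
def apparent_viscosity(shear_rate, consistency_k, flow_index_n):
    """Apparent viscosity (shear stress / shear rate) of a power-law fluid.

    Assumed model: shear_stress = K * shear_rate ** n.
    n < 1 gives shear thinning, n > 1 shear thickening, n == 1 Newtonian.
    """
    if shear_rate <= 0:
        raise ValueError("shear_rate must be positive")
    shear_stress = consistency_k * shear_rate ** flow_index_n
    return shear_stress / shear_rate


# Hypothetical parameter values, chosen only to show the three trends.
for name, n in [("pseudoplastic", 0.5), ("Newtonian", 1.0), ("dilatant", 1.5)]:
    low = apparent_viscosity(1.0, 2.0, n)
    high = apparent_viscosity(100.0, 2.0, n)
    print(f"{name}: viscosity at 1/s = {low:.3f}, at 100/s = {high:.3f}")
```

The sketch only covers the time-independent categories. Thixotropic and rheopectic behaviour would additionally require a dependence on the shear history, which the power-law model does not capture.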

2.2.1 Solder Paste

Solder paste is a fluid which is composed of a mixture of metal solder powder, a binder,

flux and other rheological components. The solder particles typically have a diameter

between 10 and 30 µm in jet printing applications and are produced to be as spherical as

possible [22]. Different alloy types can be used for the solder powder depending on the

application. The binder is used to keep the paste from separating and the flux removes

the oxide layer between the metal and solder as well as accelerates the wetting of the

metal [23]. The composition of solder pastes affects their rheological properties, and

the exact composition is generally not disclosed by the companies that produce them.

In order to provide Mycronic’s customers with a solder paste that fits their application,

Mycronic cooperates with other companies that produce solder pastes [4].

The composition of solder paste gives it a non-Newtonian behaviour. When it is

exposed to a shear stress, it exhibits a thixotropic behaviour, or in other words, the

viscosity decreases over time. Using shear sweeps, Mycronic has tested two solder

paste samples for shear viscosity as a function of shear rate which can be seen in

Figure 2.2.2. The viscosity decreases as the shear rate increases and this confirms the

thixotropic behaviour. One reason for this behaviour is that when no shear stress is

applied, attractive forces between the metal particles create flocs of particles which

increases the viscosity. As shear is applied these flocs break apart which decreases

the viscosity of the paste. Once the shear is removed flocs begin forming again and

the viscosity increases. However, the structure of the flocs might change which would

mean that viscosity does not fully return to the same state as before [24].

Figure 2.2.2: Shear viscosity as a function of shear rate for two solder paste samples [4].

2.3 Sensors

This section describes the sensors that were considered for data gathering in the

project.

2.3.1 Temperature

A common choice for temperature measurements is a thermocouple. Thermocouples

work by having a closed circuit of two dissimilar metals, as can be seen in Figure 2.3.1.

If there is a difference in temperature between the two junctions of the thermocouple,

a voltage will be produced between the two metals due to the thermoelectric effect,

which can be measured at one of the junctions [25]. This voltage can then be used to

determine the temperature at the opposite junction. The combination of metals used

in the sensor affects the voltage produced, and this can vary between sensors. The

main advantages of thermocouples are that they are robust, relatively inexpensive, can

measure a wide range of temperatures and are self-energized. The disadvantages with

the sensors are that the signal is weak which makes them sensitive to electrical noise

and also that the output is non-linear and requires amplification. Two other types of

temperature sensors are platinum resistance thermometers (PRTs) and thermistors.

The basic principle for both of these sensors is that their resistance is dependent on

temperature. However, PRTs are more expensive than thermocouples and thermistors

cannot measure as wide a temperature range [26].

Figure 2.3.1: Simple thermocouple circuit [25].

2.3.2 Current

There are several principles for measuring current, but the most common method

is using a shunt resistor [27]. A shunt resistor is a low resistance resistor used for

determining the current through the resistor by measuring the voltage drop over it.

Ohm’s Law states that

$$V = I \cdot R, \tag{2.2}$$

where V is the voltage drop over the resistor, I is the current through it and R is

the resistance. This means that the voltage changes proportionally with the current.

The advantages of shunt resistors are that they are inexpensive, robust and have high

accuracy. Some things to be aware of when using them are that there is a power loss

which is proportional to the square of the current, which means that they are generally

not suitable for measuring high currents. Furthermore, the resistance could vary due

to factors such as aging or changes in temperature, which affects the precision of the

measurement [28]. Other methods of measuring current include using Hall effect

sensors to measure changes in the magnetic field created by the current, as well as

using sensors based on Faraday’s Law where transformers are utilized.
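
The shunt principle in Equation 2.2, together with the quadratic power loss mentioned above, can be sketched as follows. The resistance and voltage values are hypothetical and are not the components used in this project.

```python
def shunt_current(voltage_drop_v, shunt_resistance_ohm):
    """Return the current in amperes from Ohm's law, I = V / R."""
    if shunt_resistance_ohm <= 0:
        raise ValueError("shunt_resistance_ohm must be positive")
    return voltage_drop_v / shunt_resistance_ohm


def shunt_power_loss(current_a, shunt_resistance_ohm):
    """Return the power in watts dissipated in the shunt, P = I^2 * R."""
    return current_a ** 2 * shunt_resistance_ohm


# Hypothetical example values, not the components used in the project.
resistance = 0.1       # ohm
measured_drop = 0.05   # volt
current = shunt_current(measured_drop, resistance)
loss = shunt_power_loss(current, resistance)
print(f"current = {current:.3f} A, loss = {loss:.4f} W")
```

Doubling the current in this sketch quadruples the dissipated power, which is why a shunt is generally less suitable for high currents.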

2.4 Data Acquisition

A powerful measurement tool that is able to make multiple measurements

simultaneously and offers a standard similar to that of laboratory equipment is the Red

Pitaya. The Red Pitaya is a single-board computer which is intended to be an alternative

to the more expensive laboratory equipment. It is an open-source instrumentation

platform that can perform a variety of measurement and test tasks [29]. The Red Pitaya has a built-in

signal generator and pre-developed apps can be downloaded from the web page or

one can develop one’s own apps [30]. Depending on the version, there are two 14-bit

or 10-bit analog-to-digital (A/D) and digital-to-analog (D/A) converters on the board

that can sample signals at a rate of 125 MHz [29]. These fast input channels

have a bandwidth of 50 MHz. The Red Pitaya also has two extension connectors,

which have access to four slow analog inputs, four slow analog outputs, 16 General-Purpose

Input/Output (GPIO), Inter-Integrated Circuit (I2C), Universal Asynchronous

Receiver/Transmitter (UART) and Serial Peripheral Interface (SPI) [29], [31]. These

slow input channels have a bandwidth of 50 kHz. So, the Red Pitaya is a useful

measurement tool if there is a demand for high-performance signal processing with high

frequency signals of up to 50 MHz [29]. Figure 2.4.1 shows the hardware overview of

the Red Pitaya where some components are highlighted.

Figure 2.4.1: Hardware overview of the Red Pitaya [31].

The Red Pitaya features the Xilinx Zynq 7010, which is pointed out in Figure 2.4.1.

This system combines a Field-Programmable Gate Array (FPGA) and a multi-core

processor. The advantage of FPGAs is that they can be reprogrammed for a desired

task after they have been manufactured [32]. In other words, an FPGA that is working as

a microprocessor can, for example, be reprogrammed to work as a graphics card. The

more common technology is the Application Specific Integrated Circuit (ASIC) where a

component is designed for only one purpose throughout its lifetime [32]. One example

of that technology is the graphics processing unit (GPU) inside a modern phone, where

the logic cannot be reprogrammed to work as another component.

2.5 Neural Network Architecture

The main objective of a neural network is to recognize patterns. This is made possible

by first having the neural network learn from a series of defining sets of input and

output correspondences. The neural network can then apply what it has learnt to new,

and unseen, input data to predict a relevant output [33]. A typical neural network is

seen in Figure 2.5.1. The structure of neural networks consists of an input layer, one or

more hidden layers, an output layer and interconnections between nodes of different

layers.

Figure 2.5.1: Typical architecture of a fully connected neural network with one hidden layer.

The training process of a neural network can be divided into two phases:

forward-propagation and back-propagation. During forward-propagation the

information is sent through the neural network and a prediction is made. The process

from the input layer to the output layer is such that the input layer first receives

information from an external source. That information is passed, via the connections,

to nodes of the hidden layer, which processes all the information. Lastly, the output

layer receives the processed data which is given to the user. The path of the information

from the input layer, through the hidden layer, to the output layer is determined by the

strength of the interconnections between nodes. Each node has a set of weights which

determines the importance of its inputs, as well as a bias which adjusts the output.

When a node in the input layer receives information, it is activated. That triggers

a signal by the activation function to be emitted to its neighbouring nodes. This signal

is either excited or inhibited depending on the strength of the interconnection, that is,

the magnitude of the weights and biases. This process continues on through the neural

network, which creates a pattern of activation that manifests itself in the output layer

[33]. The forward-propagation in a neural network is defined, mathematically, by

$$a^{(l)} = g(a^{(l-1)}; \Theta), \tag{2.3}$$

where g is the activation function, a is the preactivation, l denotes the layer and Θ

represents the parameters, or in other words, the weights and biases. The preactivation

is a weighted sum of the inputs to the layer. Figure 2.5.2 shows the principle of

actions in an artificial neuron. First, the weighted sum of the input parameters, Θn, is

calculated and then passed through an activation function, g.
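
The two steps of an artificial neuron described above, a weighted sum followed by an activation function, can be sketched in a few lines. This is a minimal illustration and not the implementation used in this project. The function names and the example weights are hypothetical, and the sigmoid is chosen only as one possible activation function.

```python
import math


def sigmoid(z):
    """Sigmoid activation function."""
    return 1.0 / (1.0 + math.exp(-z))


def neuron_output(inputs, weights, bias, activation=sigmoid):
    """Weighted sum of the inputs plus a bias, passed through the activation."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(z)


def layer_forward(inputs, weight_rows, biases, activation=sigmoid):
    """Forward-propagation through one fully connected layer.

    Each row of weight_rows holds the weights of one node in the layer.
    """
    return [
        neuron_output(inputs, row, b, activation)
        for row, b in zip(weight_rows, biases)
    ]


# Hypothetical example: two inputs feeding a layer with two nodes.
hidden = layer_forward([1.0, 2.0], [[0.5, -0.25], [0.1, 0.3]], [0.0, -0.2])
print(hidden)
```

Feeding the output of `layer_forward` into another call of the same function corresponds to applying Equation 2.3 layer by layer.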

Figure 2.5.2: Structure of an artificial neuron in a neural network.

Examples of activation functions are sigmoid, tanh and linear, which are defined

by Equation 2.4, Equation 2.5 and Equation 2.6 respectively.

$$g(z) = \frac{1}{1 + e^{-z}} \tag{2.4}$$

$$g(z) = \tanh(z) \tag{2.5}$$

$$g(z) = z \tag{2.6}$$

The first two activation functions are non-linear functions, whose purpose is to

introduce non-linearity into the neural network. Equation 2.6, on the other hand, is

a linear activation function. If only linear activation functions are used in the hidden

layers of a neural network, the output will just be a linear transformation of the input.

In other words, a composition of successive linear transformations is equivalent to

one linear transformation, which means that complex non-linear problems cannot be

accurately mapped between input and output. Moreover, most real world problems

are highly complex and non-linear, which is why non-linear activation functions are

required in, at least, the hidden layers of a neural network. However, a linear activation

function can be used in the output layer if a continuous value shall be predicted.
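
The statement that successive linear layers collapse into a single linear transformation can be checked numerically. The sketch below uses two small, hypothetical weight matrices and omits the biases for brevity, which is a simplifying assumption.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    columns = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in columns] for row in a]


# Hypothetical weights: a 2-input network with a 3-node hidden layer and
# one output, where every layer uses the linear activation g(z) = z.
w_hidden = [[1.0, 2.0], [0.0, -1.0], [3.0, 1.0]]
w_output = [[1.0, 0.0, 2.0]]
x = [0.5, -1.5]

two_layers = matvec(w_output, matvec(w_hidden, x))
single_layer = matvec(matmul(w_output, w_hidden), x)
print(two_layers, single_layer)
```

Both computations give the same result, so the hidden layer adds no expressive power unless a non-linear activation function is placed between the two multiplications.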

When this process is finished and the error of the prediction, that is the loss, has

been calculated, the model has done its forward-propagation. However, in order to

learn, that is, update its weights, back-propagation is needed. The purpose of back-propagation

is to determine how much each node contributes to the total

error [33]. These contributions are then used by a technique named gradient descent, which tunes

the weights in order to minimize the loss function which evaluates how the model is

performing. Minimizing the loss function is, thus, an optimization problem in terms of

tuning the weights of the neural network. Since the loss function is a summation of the

prediction errors by the neural network, the lower the loss, the better the performance

of the neural network. Examples of methods to calculate the loss are mean absolute

error (MAE) and mean relative error (MRE). These two methods are defined as

$$\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} |y_i - \hat{y}_i|, \tag{2.7}$$

$$\mathrm{MRE} = \frac{1}{n} \sum_{i=1}^{n} \frac{|y_i - \hat{y}_i|}{y_i}, \tag{2.8}$$

respectively, where n is the total number of data points, y is the true value and ŷ is the

predicted value.
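
The two loss measures in Equation 2.7 and Equation 2.8 can be written as short functions. This is a minimal sketch with hypothetical function names, and it assumes that the relative error is taken with respect to the true value, which must therefore be non-zero.

```python
def mean_absolute_error(y_true, y_pred):
    """Mean absolute error between true and predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_relative_error(y_true, y_pred):
    """Mean absolute error relative to the magnitude of the true value."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(t == 0 for t in y_true):
        raise ValueError("true values must be non-zero for a relative error")
    return sum(abs(t - p) / abs(t) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical values, for example deposit diameters in micrometres.
true_values = [100.0, 200.0]
predictions = [110.0, 180.0]
print(mean_absolute_error(true_values, predictions))
print(mean_relative_error(true_values, predictions))
```

The MAE is expressed in the unit of the predicted quantity, whereas the MRE is dimensionless, which makes it easier to compare quantities of different magnitude.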

At each update, how much to modify the model with respect to the estimated error

is determined by the learning rate. An excessive learning rate can cause an unstable

training process, whereas a rate that is too low will require a longer training process.

Thus, the main idea behind training neural networks is to minimize the loss function

by modifying the parameters of the model and in turn maximizing the accuracy [34]. Once

forward- and back-propagation have been completed for the entire training set, the neural network

has completed one epoch of training.
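
The influence of the learning rate can be illustrated on a toy problem. The sketch below applies gradient descent to the loss (w − 3)², which is a hypothetical one-parameter stand-in for the loss function of a neural network, and the three learning rates are arbitrary example values.

```python
def gradient_descent(learning_rate, n_steps, w_start=0.0, target=3.0):
    """Minimize the toy loss (w - target)^2 and return the final weight."""
    w = w_start
    for _ in range(n_steps):
        gradient = 2.0 * (w - target)  # derivative of the loss with respect to w
        w -= learning_rate * gradient  # gradient descent update
    return w


# Too low, reasonable and excessive learning rates on the same problem.
for rate in (0.001, 0.1, 1.1):
    final_w = gradient_descent(rate, n_steps=50)
    print(f"learning rate {rate}: distance to optimum = {abs(final_w - 3.0):.3g}")
```

In this sketch the lowest rate has barely moved towards the optimum after 50 updates, the middle rate converges, and the highest rate overshoots further at every update so that the training diverges.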

While training neural networks, the model is likely to overfit if there is no

regularization. When overfitting, the model performs well on the training data but

poorly on the new, unseen, data. This can be seen as a decreasing training loss,

but constant or increasing validation loss while training. This means that the neural

network has only memorized the training data rather than generalized on new data.

To minimize the risk of overfitting, different regularization techniques could be used,

such as dropout or L2 regularization. Implementing dropout randomly removes

connections in the neural network during training. Thus, the neural network cannot

rely on the connections between nodes, which prevents it from overfitting. The L2

regularization method dynamically penalizes the weights, such that large weights are

penalized more and vice versa. As with dropout, the L2 regularization also decorrelates

the neural network.
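
Both regularization techniques can be sketched compactly. The code below is an illustration under stated assumptions and not the implementation used in this project: dropout is written in the common form where the outputs of randomly chosen nodes are set to zero, which silences all outgoing connections of those nodes, and the surviving outputs are rescaled ("inverted dropout"). The function names and the `strength` parameter are hypothetical.

```python
import random


def l2_penalty(weights, strength):
    """L2 term added to the loss: strength times the sum of squared weights."""
    return strength * sum(w * w for w in weights)


def dropout(activations, drop_probability, rng):
    """Zero each activation with the given probability during training.

    Surviving activations are divided by the keep probability so that the
    expected sum of the layer output is unchanged (assumed convention).
    """
    if not 0.0 <= drop_probability < 1.0:
        raise ValueError("drop_probability must be in [0, 1)")
    keep_probability = 1.0 - drop_probability
    return [
        a / keep_probability if rng.random() >= drop_probability else 0.0
        for a in activations
    ]


rng = random.Random(0)  # fixed seed so that the example is repeatable
print(dropout([1.0, 1.0, 1.0, 1.0], 0.5, rng))
print(l2_penalty([1.0, -2.0], 0.5))
```

Because the L2 term grows with the square of each weight, a large weight contributes much more to the penalty than a small one, which matches the description above.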

2.5.1 Recurrent Neural Network

Recurrent neural networks (RNNs) are neural networks that are specialized in

processing sequences of data which can have variable lengths [35]. The main difference

between the structure of an RNN and other neural networks is that the nodes of an RNN

have a recurrent connection, which stores previous calculations and, thus, functions

as a memory. This results in the RNN having two inputs, the present and the recent

past. The additional input about the past holds valuable information about the future

[36]. Figure 2.5.3 shows a cell in an RNN, which has the recurrent connection that is

different from other neural networks, as seen in Figure 2.5.1.

Figure 2.5.3: Circuit diagram of a cell in an RNN. Here, xt is the input at time t, ht is the state of the hidden layer at time t and ot is the output at time t. Parameters for the input, hidden layer state and output are Θi, Θh and Θo, respectively [37].

The graphical model of an RNN cell in Figure 2.5.3 can be explained with the

following equations:

$$o_t = f(h_t; \Theta), \tag{2.9}$$

$$h_t = g(h_{t-1}, x_t; \Theta). \tag{2.10}$$

In Equation 2.9 and Equation 2.10, ot is the output of the RNN at time t, f and g

are activation functions, ht is the state of the hidden layer at time t, xt is the input

at time t and Θ represents the weights and biases. Equation 2.9 shows that the output

is dependent on the weights and biases and also the state of the hidden layer at time t.

However, Equation 2.10 shows that the state of the hidden layer at time t is dependent

on the weights and biases, input at time t and the state of the hidden layer at time t−1.

The latter equation is what differentiates RNNs from other neural networks since the

previous state of the hidden layer, ht−1, has influence on the current state of the hidden

layer, ht [37]. This demonstrates that the RNNs have a memory.
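
Equation 2.9 and Equation 2.10 can be turned into a small worked example. The sketch assumes that g is a tanh applied to an affine combination of the previous state and the current input, and that f is linear. These choices, as well as the parameter names `w_h`, `w_x`, `w_o`, `b_h` and `b_o`, are assumptions made for illustration.

```python
import math


def rnn_step(h_prev, x_t, w_h, w_x, b_h):
    """Hidden state update: h_t = tanh(W_h h_prev + W_x x_t + b_h)."""
    h_t = []
    for row_h, row_x, b in zip(w_h, w_x, b_h):
        z = sum(w * h for w, h in zip(row_h, h_prev))
        z += sum(w * x for w, x in zip(row_x, x_t))
        h_t.append(math.tanh(z + b))
    return h_t


def rnn_output(h_t, w_o, b_o):
    """Output with a linear activation: o_t = W_o h_t + b_o."""
    return [sum(w * h for w, h in zip(row, h_t)) + b for row, b in zip(w_o, b_o)]


def run_rnn(sequence, h0, w_h, w_x, b_h, w_o, b_o):
    """Process a sequence step by step, reusing the same parameters."""
    h = h0
    outputs = []
    for x_t in sequence:
        h = rnn_step(h, x_t, w_h, w_x, b_h)
        outputs.append(rnn_output(h, w_o, b_o))
    return outputs, h


# Hypothetical one-dimensional example: the second input is zero, but the
# second output is not, because the hidden state remembers the first input.
outputs, final_state = run_rnn(
    [[1.0], [0.0]], [0.0], [[0.5]], [[1.0]], [0.0], [[1.0]], [0.0]
)
print(outputs)
```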

However, there are two major drawbacks of the RNN architecture: vanishing and

exploding gradients. Both of these issues can occur only during the back-propagation

phase if there are long-term dependencies, that is, if the network has to memorize a long sequence.

So, the vanishing or exploding gradients occur due to multiplication in the chain rule

of the partial derivatives in the back-propagation through time [37]. Gradients that

are less than one shrink exponentially due to continuous matrix multiplication until

the gradients vanish. The same applies for the exploding gradients when the gradients

are greater than one, but then the gradients start increasing and eventually cause a

numerical overflow. A solution to this issue is to choose an alternative recurrent neural

network, namely long short-term memory (LSTM).
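
The exponential shrinking and growth described above can be illustrated with a scalar example. The sketch replaces the product of partial derivatives in back-propagation through time with a repeated multiplication by a single constant factor, which is a strong simplification, and the factors 0.9 and 1.1 are arbitrary example values.

```python
def gradient_after_steps(per_step_factor, n_steps, initial_gradient=1.0):
    """Multiply a gradient by the same factor once per time step."""
    gradient = initial_gradient
    for _ in range(n_steps):
        gradient *= per_step_factor
    return gradient


for factor in (0.9, 1.0, 1.1):
    value = gradient_after_steps(factor, n_steps=100)
    print(f"factor {factor}: gradient after 100 steps = {value:.3g}")
```

Even factors close to one lead to a gradient that is several orders of magnitude smaller or larger after 100 time steps, which is why long sequences are difficult for a plain RNN.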

Long Short-Term Memory

The LSTM architecture is a gated version of the RNN architecture, which addresses the

issue of long-term dependencies [35], [38]. This implies a more complicated structure

of the cell than in RNNs. As seen in Figure 2.5.4, the cell consists of three different

gates: a forget gate, an update gate and an output gate. The gates consist of either a

sigmoid or a tanh function (see Equation 2.4 and Equation 2.5 respectively) in order to

control the flow of information through the LSTM cell. These two types of activation

functions in the LSTM cell also introduce non-linearity to the neural network.

Figure 2.5.4: Illustration of the data flow through an LSTM cell. The three different gates are highlighted: forget gate, update gate and output gate [39].

The inputs of the LSTM cell are the current input, xt, the previous hidden state,

ht−1, and the previous memory state, ct−1. The outputs are the current memory

state, ct, and the current hidden state, ht. The core concept of an LSTM cell is that

information can be passed forward on the cell state memory line, shown as the top

horizontal line in Figure 2.5.4, and information can either be removed or added by

the forget and update gates, respectively. This enables relevant information to be

transferred, touched or untouched, throughout the processing of the sequence, which

addresses the problem of long-term dependencies with RNNs [35]. Figure 2.5.4 shows

one of many cells that can be connected in series, which can be simplified by a recurrent

connection as in Figure 2.5.3. Thus, the illustration in Figure 2.5.3 can be extended by

removing the recurrent connection and adding as many cells as the length of the input

sequence in series. The cell outputs in Figure 2.5.4 can be expressed mathematically

by:

$$c_t = f_t \odot c_{t-1} + i_t \odot g_t, \tag{2.11}$$

$$h_t = o_t \odot \sigma_c(c_t) \tag{2.12}$$

where the forget gate, ft, the update gate, it and gt, and the output gate, ot, are defined

as

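$$f_t = \sigma_g(h_{t-1}, x_t; \Theta_f), \tag{2.13}$$

$$i_t = \sigma_g(h_{t-1}, x_t; \Theta_i), \tag{2.14}$$

$$g_t = \sigma_c(h_{t-1}, x_t; \Theta_g), \tag{2.15}$$

and

$$o_t = \sigma_g(h_{t-1}, x_t; \Theta_o), \tag{2.16}$$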

respectively, where σg is the gate activation function and σc is the state activation

function.
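
Equations 2.11 to 2.16 can be combined into a single update step of an LSTM cell. The sketch below assumes that each gate applies its activation function to an affine combination of the concatenated previous hidden state and current input, that the gate activation is a sigmoid and that the state activation is a tanh. The `params` dictionary and its layout are hypothetical and only serve the illustration.

```python
import math


def sigmoid(z):
    """Gate activation function."""
    return 1.0 / (1.0 + math.exp(-z))


def affine(weights, bias, h_prev, x_t):
    """Compute W [h_prev, x_t] + b for one gate (assumed parameterization)."""
    concatenated = list(h_prev) + list(x_t)
    return [
        sum(w * v for w, v in zip(row, concatenated)) + b
        for row, b in zip(weights, bias)
    ]


def lstm_step(h_prev, c_prev, x_t, params):
    """One LSTM step; params maps 'f', 'i', 'g', 'o' to (weights, bias)."""

    def gate(name, activation):
        weights, bias = params[name]
        return [activation(z) for z in affine(weights, bias, h_prev, x_t)]

    f_t = gate("f", sigmoid)    # forget gate, Equation 2.13
    i_t = gate("i", sigmoid)    # update gate, Equation 2.14
    g_t = gate("g", math.tanh)  # candidate values, Equation 2.15
    o_t = gate("o", sigmoid)    # output gate, Equation 2.16

    # Equation 2.11: keep part of the old memory and add new information.
    c_t = [f * c + i * g for f, c, i, g in zip(f_t, c_prev, i_t, g_t)]
    # Equation 2.12: expose a filtered version of the memory as hidden state.
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]
    return h_t, c_t


# Hypothetical cell with one hidden unit, one input and all parameters zero.
zero_params = {name: ([[0.0, 0.0]], [0.0]) for name in "figo"}
h_t, c_t = lstm_step([0.0], [2.0], [1.0], zero_params)
print(h_t, c_t)
```

With all parameters set to zero, every sigmoid gate evaluates to 0.5, so half of the previous memory state is carried over and nothing new is added. Processing a whole sequence amounts to calling `lstm_step` once per time step and passing the returned states on to the next call.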

Different types of LSTM models include vanilla, stacked, encoder-decoder and

bidirectional. Vanilla LSTMs are often referred to as the default or standard version

of the architecture and consist of an input layer, one fully connected hidden LSTM

layer and a fully connected output layer. This is the simplest version of an LSTM and

is generally a good starting point for solving a problem. Models with more than one

hidden LSTM layer are referred to as stacked LSTMs. The advantage of having more

than one layer is that it can improve the performance of a neural network. Additionally, having

several small layers is generally more efficient than having one large layer [40]. The

layout for these two models can be seen in Figure 2.5.5a and Figure 2.5.5b.

The encoder-decoder model is useful for sequence-to-sequence problems, that is,

when the input is a sequence of values and the goal is to predict the coming sequence

of values. This architecture contains an encoder model which processes the input and

encodes it into a vector with fixed length. This vector is then given to the decoder

model which decodes the vector and gives the predicted sequence. The main use of the

architecture is in natural language processing and text translation, and its layout can be seen

in Figure 2.5.5c. The encoder-decoder model has been found to occasionally be more

efficient when the input is reversed, and this phenomenon is utilized in the bidirectional

model. In this model the input is fed to two layers which are side-by-side, as can be seen

in Figure 2.5.5d. The forward input sequence is given to the first layer and a reversed

version of the input sequence is given to the other layer. This method has been known

to increase the performance of a neural network, but it does require that the entire

input is available [40].

(a) (b) (c) (d)

Figure 2.5.5: LSTM models [40]. (a) Vanilla LSTM. (b) Stacked LSTM. (c) Encoder-decoder LSTM. (d) Bidirectional LSTM.

Chapter 3

Methodology

The following chapter presents the methodology used in this project. The chosen

research strategy, the internal and external validity, as well as an overview of the

procedure used are presented and discussed.

3.1 Research Strategy

It is of importance to choose a methodology in research since it explains what type of

systematic approach is being used to solve the problem. In other words, it is a work

plan to address the research problem by defining the procedure of methods by which

knowledge is obtained [41]. The methodology also defines the quality assurance of the

project, that is, the validation and verification of the research material [42].

The methodology used in this project was an empirical quantitative research

approach, utilizing a case study as the research strategy. In [43], the purpose of

quantitative research is defined as the study of relationships, cause and effect. It is also

mentioned in [42], [43] and [44] that quantitative research is characterized by large

data sets. Considering that the purpose of this project was to gather large amounts

of data and to create a neural network to examine the effect certain parameters have

on the quality of jetting deposits, a quantitative research approach was deemed to be

suitable.

A case study was chosen due to its usefulness when doing an empirical study of a

particular phenomenon using multiple sources of evidence [42]. The phenomenon to

be studied in this project was how the measured temperature and piezo current affected

the accuracy in predicting the quality of jetting deposits. In [45], it is stated that a case

study is beneficial if knowledge is to be obtained regarding a new phenomenon. This

project is the first at Mycronic that investigates the potential benefits of applying a

neural network in their jet printing machine to predict the quality of the deposits. The

use of a case study is also supported by [46], in which it is stated that a case study will

give indications on hypothesis creation. Since the intended purpose of this project was

to create a hypothesis regarding how the defined sensors improved the ability to predict

quality in jet printing machines, a case study was appropriate for achieving this. Using

the knowledge gained in this project, the created hypothesis can be examined further

in future projects, which is discussed in Chapter 8.

An alternative research method is using experiments. Experimental methods deal

with the relationship and effects between variables as they are manipulated [42]. Since

it is not within the scope of this project to manipulate variables, such as temperature

or current, an experimental method was not chosen. It is also mentioned in [47]

that an advantage of using case studies over experiments is that only naturally

occurring cases are investigated, rather than cases created by the researcher. Thus,

the realism of the project can be assured by using a case study. Additionally, the causal

hypotheses generated by case studies can sometimes enable researchers to recognize

causal relationships in a way that is not possible in experimental research [47].

There were two different types of jet printing jobs used for collecting the data, both

of which are frequently used by Mycronic’s test engineers. One was a simpler job, while

the other was a more complex and realistic job. The simpler one was a BGA test, where

the shots had a constant diameter and were shot at a constant frequency. Since a

majority of jobs used by customers do not have constant diameters and frequencies,

this job was considered to be a simple test. The other job was the RT1 job, where the

diameter of the shots varied between five different values and the frequency between

three different values. Both tests are explained in more detail in Section 2.1.2. These

tests are used by Mycronic to evaluate the robustness of the machine, but they can

also be considered to be realistic tests since they simulate the machine usage by the

customers. Varying variables are more difficult for the neural networks to predict. By

performing these two types of jobs, a clearer and more unambiguous evaluation of

the neural network could be made. Since the tests were performed on a real machine

using real jet jobs used by test engineers, the tests could be generalized for the given

combination of ejector and solder paste used in this project.

3.2 Internal and External Validity

In order to provide an answer to the research question, this project used a case study

which was split into two different tests with different amounts of varying variables.

The two cases were chosen such that the performance of the neural network could be

verified and also so that the realism and internal validity of the project would increase.

As mentioned in [48], the internal validity of a project can increase if multiple data

sources of the same method are used. By using the two different jobs, the project

covered a broader range of different usages of the MY700. The BGA job was useful to

first find potential correlations between the sensor data and the quality of the deposits,

while the RT1 job both checked the robustness of the neural network and simulated the

real environment it would be active in. Another factor that was important for the

internal validity of this project was that the created neural network was capable of

finding correlations in the data if there were any. The ability of the neural network

to recognize patterns was verified and is described in Chapter 4, but it is possible that

other architectures than the one chosen would be even better at finding correlations.

In [49], it is mentioned that external validity relates to how generalizable a study

is. After discussion with the stakeholders at Mycronic, it was decided that the main

focus of the project was hypothesis creation and to create a framework upon which

further studies can be performed. Due to time and resource limitations, it was not in

the scope of the project to evaluate with other hardware setups, such as using different

machines, ejectors or solder pastes. This could however be done in future projects to

further examine the generalizability of the results and to improve the external validity.

3.3 Procedure

The first phase of the project was to create the project formulation, decide on a

methodology and create a project time plan. After that, a background study about

ejector technology, the sensors that were used and neural network architectures was

done. Since data collection was one of the main components of this project, it was of

high importance to carefully study both what data points to get and also how to process

them. After the prestudy was finished, the test setup was created and data was gathered

from a MY700. The temperature sensor data was gathered from three different places

on the ejector, as explained in Section 1.2 and shown in Figure 1.2.1, and a shunt

resistor was used to measure the current through the piezoelectric actuator. When

this configuration had been built, the two types of jobs explained above were run. The

data from the sensors and the jetting deposits was given to the neural network and

training was performed. Finally, the accuracy of the neural network was evaluated and

conclusions were drawn.

During the design of the project, replicability was taken into account to enable

future improvements to the project. It is mentioned in [42] that replicability is one

of several quality assurances to be aware of when designing the methodology. For this

project, the procedure for how to set up the hardware components in order to gather

data is documented, as is the verification and validation process of each component.

Furthermore, a seminar has been held at Mycronic for the engineers, for the purpose

of transferring knowledge of the project and how to replicate the tests.

Chapter 4

Implementation

The following chapter will first explain the implementation of the hardware and

software that have been developed and used throughout the project in Section 4.1

and Section 4.2, respectively. Furthermore, the procedure to perform the cases is

clarified in Section 4.3. Lastly, verification and validation of units of the system will be

presented in Section 4.4.

4.1 Hardware Configuration

This section describes how the machine, cassette and ejector were modified in order

to install the sensors. An overview of the hardware system is shown in Figure 4.1.1

and highlighted components are the Red Pitaya, the trigger signal from the piezo PCB

driver and the cassette containing the solder paste and ejector. The cassette contains

the current sensor and the ejector contains the temperature sensors. Furthermore,

a PCB was developed in order to enable the Red Pitaya to read the values from the

temperature sensors as well as the trigger signal from the MY700.

Figure 4.1.1: The hardware setup in the MY700 for data gathering.

Modifications to the MY700 included two Bayonet Neill–Concelman (BNC) coaxial

connectors that were connected to the debug pins of the piezo PCB to extract the trigger

signal, which is marked as ”Trigger signal” in Figure 4.1.1. This signal is used by the

Red Pitaya in order to initiate sampling from the sensors. The Red Pitaya was mounted

on top of the horizontal beam which moves the printing head back and forth, as shown

in Figure 4.1.1. A power cable and an Ethernet cable were run via the wiring harness

out through the back of the machine. The wires for the sensors along with the coaxial

cable for the trigger signal were run in the opposite direction up to the printing head.

Cable ties were used to attach the cables and extra caution was used to ensure that no

cables came in the way of the machinery.

Figure 4.1.2 illustrates the data transfer between the different hardware

components. The connection between the MY700 and the database, labeled ”Job

data”, represents the quality measurements taken by the machine. The data is

obtained through a series of photographs taken by the machine of the deposits. The

images undergo a processing stage to extract quality measurements, such as diameter,

positioning, shape and satellites. These measurements are then sent to the database

to be stored together with its corresponding sensor data.

Figure 4.1.2: Flow chart illustrating the data transfer between different components.

4.1.1 Sensors

In Chapter 1, it was explained that three positions for the temperature measurement

would be used, as shown in Figure 1.2.1. The positioning of the sensors was determined

by the work in [4], and a more extensive explanation of the importance of the

positions is given in Chapter 1. Since these positions include components

and mechanisms that can be affected by changes in the rheological properties, which

are a function of temperature, it was decided to keep these temperature sensor

positions. However, it was noticed that the mounting of the sensors could be improved.

The previous assembly was such that the sensors were inserted into the drilled holes

and a glue gun was used to fix them in that position. This setup risked

insulating the sensors from the ejector chassis if some glue had entered the

hole. Thus, a new ejector was modified with the same configuration as in [4], but a

heat conductive paste was added into the holes with the sensors before fastening them

with a glue gun. This ensured a more accurate temperature reading from the sensors.

To measure the temperature, IT-18 thermocouples [50], which have an

accuracy of ±0.1 °C, were used. These were the same type of sensors as the ones used in [4],

and the blue arrows in Figure 4.1.2 show the data measured by the thermocouples

being transferred to the Red Pitaya. In this project, the thermocouples measured

temperatures in the range of 20 °C to 40 °C. The signals from the thermocouples in

that range were between 1.196 mV and 1.612 mV [51]. These signals had to be amplified,

which is explained in Section 4.1.2 below. The thermocouple that was placed by the

ejector nozzle protruded slightly from the bottom of the ejector. This meant that the

distance between the nozzle and the substrate had to be increased from the default

value of 650 μm to 800 μm when operating the MY700.

To measure the current, it was decided to keep the setup used in [4] which was to

have a shunt resistor on the low side of, and in series with, the piezoelectric actuator.

This reduced any issues with the common-mode voltage. If the current and resistance

are not too high, a shunt resistor also dissipates little power, since

$P_D = I^2 \cdot R$, (4.1)

where $P_D$ is the dissipated power, $I$ is the current and $R$ is the resistance. Thus, the

shunt has little influence on the circuit. Two shunt resistors were mounted in parallel,

each with a resistance of $R_s = 0.1$ Ω, which resulted in a total resistance of $R = 0.05$

Ω. The green arrow in Figure 4.1.2 shows how the data measured by the shunt resistor

is transferred to the Red Pitaya.
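The shunt arithmetic above can be checked with a short sketch. The function names and the 25 mV example reading are hypothetical and not taken from the project's scripts; only the two 0.1 Ω resistors and Equation 4.1 come from the text.

```python
def parallel_resistance(*resistances_ohm):
    """Equivalent resistance of resistors connected in parallel."""
    return 1.0 / sum(1.0 / r for r in resistances_ohm)


def shunt_current(voltage_drop_v, shunt_ohm):
    """Ohm's law: current through the shunt from the measured voltage drop."""
    return voltage_drop_v / shunt_ohm


def dissipated_power(current_a, shunt_ohm):
    """Equation 4.1: P_D = I^2 * R."""
    return current_a ** 2 * shunt_ohm


# Two 0.1 ohm shunt resistors in parallel give 0.05 ohm.
R_TOTAL = parallel_resistance(0.1, 0.1)

# Hypothetical reading (not a measured value): 25 mV across the shunt pair.
example_current = shunt_current(0.025, R_TOTAL)              # about 0.5 A
example_power = dissipated_power(example_current, R_TOTAL)   # about 12.5 mW
```

Halving the resistance by mounting the resistors in parallel also halves the dissipated power for a given current, at the cost of a smaller voltage drop to measure.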

4.1.2 Red Pitaya

It was decided that the same Red Pitaya as the one used in [4] would be used for

this project as well. The three thermocouples were connected to the Red Pitaya via

a PCB which was mounted on top of the Red Pitaya, as can be seen in Figure 4.1.3.

Three thermocouple amplifiers with cold junction compensation were used to amplify

the signals from the thermocouples. Cold junction compensation means that these

integrated circuits (ICs) measure the temperature at the reference junction and correct for it, emulating an ice point reference

for the thermocouples, which was needed in order to make absolute temperature readings.

Once amplified, a 10 mV change of the output signal corresponded to a 1 °C change

in temperature [52]. All output signals were then given to the analog input pins of the

Red Pitaya. A BNC coaxial connector was mounted on top of the PCB and connected

to one of the digital input pins of the Red Pitaya. This was used to register a trigger

signal which was sent out from the machine every time a shot was ejected. This trigger

signal let the Red Pitaya know when to take a measurement and is shown as the red

arrow in Figure 4.1.2. The temperature and trigger signals, along with ground and

5 V signals, were connected between the PCB and Red Pitaya using two flat cables

with a 26-way insulation-displacement contact (IDC) connector plug at each end. The

current sensor was connected directly to one of the 14-bit fast channels of the Red

Pitaya which had a sampling rate set to 15.6 MHz. This was done using another BNC

type connector, which is marked as the green arrow in Figure 4.1.2. The low voltage

input (±1 V) of the fast channel was used, since the signal would never exceed ±1

V with the chosen shunt resistor. The signal was further downsampled to 3.9 MHz

to reduce memory usage and increase efficiency. A sampling rate of 3.9 MHz was

deemed to be sufficient since this gave 390 samples for each current curve, which

was enough to resolve the important trends. A 14-bit resolution was also deemed to be enough

since, over the ±1 V input range, this gave a resolution of approximately 0.12 mV. Observations showed that this

sampling frequency and resolution gave a smooth and continuous curve. A schematic

overview of the Red Pitaya can be found in Appendix A.
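The sampling figures in this paragraph can be reproduced with a few lines of arithmetic. This is a minimal sketch: the constant names are illustrative, and plain decimation (keeping every n-th sample) is an assumption, since the text does not state how the downsampling was implemented.

```python
FULL_SCALE_V = 2.0        # span of the +/-1 V low-voltage input
ADC_BITS = 14
RAW_RATE_HZ = 15.6e6      # configured sampling rate of the fast channel
TARGET_RATE_HZ = 3.9e6    # rate after downsampling
SAMPLES_PER_SHOT = 390
SHUNT_OHM = 0.05

lsb_v = FULL_SCALE_V / 2 ** ADC_BITS               # voltage per ADC step, about 122 uV
lsb_current_a = lsb_v / SHUNT_OHM                  # same step expressed as shunt current
decimation = round(RAW_RATE_HZ / TARGET_RATE_HZ)   # keep every 4th sample
window_s = SAMPLES_PER_SHOT / TARGET_RATE_HZ       # time covered by one current curve


def downsample(samples, factor):
    """Keep every `factor`-th sample (assumed method, no anti-alias filtering)."""
    return samples[::factor]
```

With these numbers, 390 samples at 3.9 MHz cover a window of 100 µs per shot, and one ADC step corresponds to roughly 2.4 mA through the 0.05 Ω shunt.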

Figure 4.1.3: The PCB used for data gathering.

4.2 Software Configuration

Two main scripts were used in this project: one for running the

MY700 and collecting data from the sensors during operation, and one for the

neural network. Both scripts were written in Python, and the neural network was

designed with TensorFlow, which is an open-source software library for machine

learning in Python.

The first script had already been developed, but it was designed for the earlier MY600

machine, so some changes had to be made to adapt it to the MY700. When the MY700

had finished a job, the data recorded by the Red Pitaya was stored in an

open-source database called MongoDB. Once this step was completed, the MY700 continued

by scanning the quality of the deposits and also sent this data to the database,

as seen in Figure 4.1.2. Uploaded data could easily be accessed for further modification

before being fed to the neural network.

The design of the neural network followed the architecture of a stacked LSTM

model. It was decided to use an LSTM model for this project since its architecture

reduces the risk of vanishing or exploding gradients compared to other

recurrent neural networks, and for its advantages in processing sequences of data.

Different architectures of LSTM models were evaluated by manually comparing

their performance, and it was concluded that the stacked LSTM was the best-performing

model on the data. The model predicted the diameter of the next shot based on the

data from the thermocouples and the current sensor. It was given a window size

of 10, that is, the number of previous values used for making predictions. For RT1,

the number of epochs was increased from 25 to 35 since it needed more time for the

loss to level off. Different optimizers, such as stochastic gradient descent (SGD) and

Adam, were evaluated for this problem, and Adam showed the most promising

results. This optimizer has the advantage of adapting an individual learning rate for

each parameter of the neural network instead of using a single constant learning rate

for all parameters as in, for example, SGD. The loss function of the neural network

was the MAE, which is defined in Equation 2.7.

A summary of the LSTM architecture and its parameters that were used and

evaluated to suit the purpose of this project is shown in Table 4.2.1. This design was

determined by manually tuning the parameters of the network, deciding the number of layers,

selecting the window and batch size, and recording the performance in order to decide

on a design. For the activation function in the output layer, both tanh and linear were

evaluated, but linear showed the best results. A possible reason is that the

model predicts a continuous value and, thus, as described in Section 2.5, the linear

activation function could be used in the output layer.

Table 4.2.1: The LSTM architecture for evaluation.

Type           Network architecture   Optimizer   Window size   Batch size   Epochs
Stacked LSTM   Layer 1: 200           Adam        10            256          25/35
               Dropout: 0.4
               Layer 2: 200
               Dropout: 0.4
               Dense: 1
               Activation: Linear

Before the data was fed to the neural network, it was standardized to a mean

of zero and a standard deviation of one. Since the inputs had different ranges,

it was beneficial for the neural network to receive standardized data so that it did not

form a bias toward any of the inputs while predicting quality. Standardization also helped in the

backpropagation phase, so that the neural network converged more easily. After the data had

been standardized, it was split into training, validation and test data, where the training

data was about 80% of the dataset, which is a commonly used split.

During the training phase of the LSTM model, the training data was used, and after each

completed epoch, the learning process was validated using the validation

data. After each epoch, the loss was calculated and recorded for both the training and

the validation process. When all epochs were completed, that is, when the training of the

LSTM model was finished, the model was given the unseen test data to evaluate

its performance. The input data to the model had to have three dimensions: batch

size, time steps and features, where batch size is the number of samples per iteration,

time steps is the number of past time steps in one sample and features is the number

of observations in one time step.
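The standardization and the three-dimensional input layout can be illustrated with a small sketch in plain Python. The helper names, the four-element feature vector and the toy data are hypothetical; how the 390 current samples per shot were turned into features, and whether the window ends just before the predicted shot, are assumptions not settled by the text.

```python
from statistics import mean, pstdev


def standardize(column):
    """Scale one feature column to zero mean and unit standard deviation."""
    mu, sigma = mean(column), pstdev(column)
    return [(x - mu) / sigma for x in column]


def make_windows(features, targets, window_size=10):
    """Build inputs shaped (samples, time steps, features) and next-shot targets.

    features: one feature vector per shot, e.g. [current, T1, T2, T3] (assumed)
    targets:  one measured diameter per shot
    """
    inputs, outputs = [], []
    for end in range(window_size, len(features)):
        inputs.append(features[end - window_size:end])  # the 10 previous shots
        outputs.append(targets[end])                    # diameter of the next shot
    return inputs, outputs


# Hypothetical toy data: 25 shots with 4 features each.
shots = [[float(i), 30.0 + 0.1 * i, 31.0 + 0.2 * i, 32.0 - 0.1 * i] for i in range(25)]
diameters = [380.0 + (i % 3) for i in range(25)]

X, y = make_windows(shots, diameters, window_size=10)
z = standardize([row[1] for row in shots])  # one standardized feature column
```

Here `X` holds 15 samples, each with 10 time steps of 4 features, which matches the (batch size, time steps, features) layout once the samples are grouped into batches.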

4.3 Experimental Procedure

The case study examined two different cases, which are presented in Table 4.3.1 with

their associated variable configuration. The first case was a BGA job which was run

with constant variables while the second case was an RT1 job where the frequency and

diameter varied between 160 and 300 Hz and between 330 and 520 µm, respectively. In both of these

cases, the measured variables were the temperature and the current.

Table 4.3.1: Variable configuration for the BGA and RT1 jobs. The BGA job has one combination and the RT1 job has 15 unique combinations.

           Diameter [µm]             Frequency [Hz]
BGA job    380                       200
RT1 job    330, 370, 429, 482, 520   160, 230, 300

The procedure was almost identical for the two cases. The MY700 was initialized

by first performing an extended purge which made sure that the ejector was filled with

solder paste. This was followed by a machine calibration procedure to ensure that the

solder paste deposits were acceptable. From the Python script, either the BGA or RT1

job was selected to be performed by the MY700 and it followed the instructions given

in Section 4.2. The BGA job created a 6x10 grid of generic BGA patterns where each

BGA contained 460 dots, as shown in Figure 2.1.3a. This resulted in 27,600 shots per

BGA job. The RT1 job consisted of several rows of solder paste deposits with different

combinations of diameter and frequency, in total 15 unique combinations and 34,200

shots. Several runs were made for both job types, and data was collected from almost

400,000 shots in total for each job type. Out of these, the last 50,000 shots were

used as test data to evaluate the performance of the neural network after training was

completed. For both BGA and RT1, each deposit had one temperature data point per

temperature sensor and 390 current data points. When the data from the runs was

available in the database, it was downloaded and given to the neural network. How the

data was managed before being fed to the neural network is explained in Section 4.2.
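The shot counts and the hold-out of the last 50,000 shots can be sketched as follows. The function name is hypothetical, and the split is assumed to be chronological with training data taken as 80% of the whole dataset; the size of the validation set is not stated in the text and follows here from those two assumptions.

```python
SHOTS_PER_BGA_JOB = 6 * 10 * 460   # 6x10 grid of BGA patterns, 460 dots each


def chronological_split(samples, n_test, train_fraction=0.8):
    """Split without shuffling: first part for training, last n_test for testing."""
    n_train = int(len(samples) * train_fraction)
    val_end = len(samples) - n_test
    return samples[:n_train], samples[n_train:val_end], samples[val_end:]


# Scaled-down stand-in for roughly 400,000 shots with 50,000 held out.
shot_ids = list(range(400))
train, val, test = chronological_split(shot_ids, n_test=50)
```

Under these assumptions, a dataset of about 400,000 shots would give roughly 320,000 training, 30,000 validation and 50,000 test shots.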

The development of the neural network was an iterative process in terms of

parameter settings. However, Table 4.2.1 shows the architecture together with its

parameters that had the best performance on the data. In order to answer the research

question, five cases for the LSTM model were designed as seen in Table 4.3.2. The first

case had the current data as input and served as a baseline. The next three cases used

data from each individual temperature position with supporting data from the current.

Finally, a test with all of the sensors was performed to see if all the sensors together

increased the performance. The cases in Table 4.3.2 were performed with a BGA job.

The case that showed the most promising performance was tested with the RT1 job.

This was done in order to test the robustness of the best LSTM model found.

Table 4.3.2: The BGA job evaluation cases for the LSTM model, each having different input configurations to the model.

        Parameter
Case    Current   Temp 1   Temp 2   Temp 3
1       X         -        -        -
2       X         X        -        -
3       X         -        X        -
4       X         -        -        X
5       X         X        X        X

4.4 Verification and Validation

This section explains the procedure for verification and validation of the sensors

and the prediction model, in order to meet the requirements and to answer the research question.

4.4.1 Thermocouples

Before the thermocouples were mounted on the ejector, they were tested and

calibrated. The calibration procedure started with connecting the sensors to the PCB.

The tips of the sensors were kept at the same position during calibration. A reference

thermometer was used to register the ambient air temperature. The probe of the

thermometer was placed together with the sensors. Before any calibration tests were

performed, the thermometer had to adapt to the surrounding temperature. Within two

hours the thermometer had stabilized at 23.6 °C. The calibration was based on three

tests, where the mean value of the data from each thermocouple was calculated and

used in order to add a software offset such that the thermocouples showed 23.6 °C.

Figure 4.4.1a shows the thermocouples before calibration and Figure 4.4.1b shows the

thermocouples after calibration.
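The software offset described above can be sketched as follows. The raw readings are made-up illustration values, and pooling all three tests into one mean per thermocouple is an assumption about how the mean was formed.

```python
from statistics import mean

REFERENCE_C = 23.6  # reading of the reference thermometer


def calibration_offset(test_runs, reference_c=REFERENCE_C):
    """Offset that moves the mean of all calibration readings onto the reference."""
    all_readings = [reading for run in test_runs for reading in run]
    return reference_c - mean(all_readings)


def apply_offset(readings, offset):
    """Add the software offset to raw thermocouple readings."""
    return [reading + offset for reading in readings]


# Hypothetical raw readings [degC] from three calibration tests of one thermocouple.
runs = [[24.9, 25.1, 25.0], [25.2, 25.0, 24.8], [25.1, 24.9, 25.0]]
offset = calibration_offset(runs)
calibrated = apply_offset(runs[0], offset)
```

A constant offset of this kind corrects the reading at the calibration temperature only; it does not correct any gain error across the 20 °C to 40 °C range.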

(a) (b)

Figure 4.4.1: Measured temperature by the thermocouples. (a) The measured ambient temperature before calibration. (b) Calibrated thermocouples to surrounding temperature (23.6 °C).

It was also important that the sensors could respond to variations in temperature.

This was tested with the thermocouples calibrated. In order to validate that

the thermocouples responded to temperature changes, they were kept at room

temperature, 23.6 °C, then held between two fingertips and finally released. It

can be seen in Figure 4.4.2 that the sensors responded to the temperature changes.

Figure 4.4.2: Calibrated thermocouples. The response from each thermocouple as the temperature varies.

In order for the wires of the thermocouples to not interfere with the movement of

the MY700, they had to be routed along the high-current cables to the cassette holder.

The wires of the thermocouples only have a thin layer of insulation, which put them

at risk of being affected by electromagnetic interference (EMI). It was observed that

when the thermocouples were mounted on the ejector in the MY700, the signals became

noisier, as seen in Figure 4.4.3a. This was likely due to EMI from the cables of

the machine. A low-pass filter was applied to the measured signal to compensate for

this. Since the temperature is expected to change relatively slowly, a cutoff frequency

of $f_{\text{cutoff}} = 3$ Hz was chosen. The filtered signal is shown in Figure 4.4.3b.
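A low-pass filter with a 3 Hz cutoff could look like the sketch below. The filter type is not stated in the text, so a first-order (single-pole) IIR filter is assumed here, as is a temperature sample rate of 200 Hz (one sample per shot in a BGA job); the noisy signal is synthetic.

```python
import math


def low_pass(samples, cutoff_hz, sample_rate_hz):
    """First-order IIR low-pass filter (assumed filter type)."""
    dt = 1.0 / sample_rate_hz
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    alpha = dt / (rc + dt)                 # smoothing factor between 0 and 1
    filtered = [samples[0]]
    for x in samples[1:]:
        filtered.append(filtered[-1] + alpha * (x - filtered[-1]))
    return filtered


# Synthetic example: 30 degC with alternating +/-0.5 degC noise, sampled at 200 Hz.
noisy = [30.0 + (0.5 if i % 2 == 0 else -0.5) for i in range(400)]
smooth = low_pass(noisy, cutoff_hz=3.0, sample_rate_hz=200.0)
```

A filter like this also removes genuine shot-to-shot temperature variation, so the filtered temperature mainly carries information about slow trends.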

(a) (b)

Figure 4.4.3: Measured temperature by the thermocouples during an RT1 job. (a) The measured temperature before filtering. (b) The measured temperature after filtering.

4.4.2 Shunt resistor

The positioning of the shunt resistor is shown in the simplified schematic in Figure

4.4.4. The Red Pitaya measures the voltage drop across the resistance to calculate the

current using Equation 2.2.

Figure 4.4.4: A schematic of the positioning of the shunt resistor that is on the low side of the piezo. The resulting resistance is R = 0.05 Ω by having two shunt resistors, Rs = 0.1 Ω, in parallel [4].

The first part of verifying and validating the shunt resistor was to confirm that the

current of the shots behaved similarly to that in [4]. This was achieved, and the current

curve can be seen in Figure 4.4.5. This figure shows the mean value and the standard

deviation of all the shots from a BGA job.

Figure 4.4.5: Measured piezo current when jet printing a BGA job with the diameter of the deposits being 380 µm. The dark blue curve shows the mean current of all shots and the shaded area around the curve illustrates the standard deviation.

However, it was also of interest to evaluate potential noise from the circuit, which

was done by connecting the probe and its ground to the high side of the shunt resistor.

This resulted in values being different from zero as different BGA jobs were performed,

as seen in Figure 4.4.6a, indicating that there was noise affecting the measurement.

The diameter range in which the ejector deposited during the RT1 test was between

330 µm and 520 µm. Figure 4.4.6a indicates a difference in noise depending on the

diameter of the deposited solder paste. Furthermore, the RT1 operated between 160

Hz and 300 Hz and the noise from the extreme values shows almost identical curves

in Figure 4.4.6b, showing that the frequency had little effect on the noise.

(a) (b)

Figure 4.4.6: Measured noise from the current sensing with different BGA job settings. (a) Diameters: 330 µm, 380 µm, 429 µm, 520 µm. Frequency constant at 300 Hz. (b) Frequencies: 160 Hz, 300 Hz. Diameter constant at 330 µm.

The measured noise was therefore subtracted from the measured current. This

resulted in the curve in Figure 4.4.7, which has a shape similar to the waveform in Figure 1.2.2.

Figure 4.4.7 also shows a rise time of 34 µs and a plateau time of 50 µs, which

correspond to the parameter settings for the job, indicating that the waveform is correct.
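Subtracting the noise from the current can be sketched as a point-wise operation on the sampled curves. The helper names and the short example curves are hypothetical, and using the mean noise curve for the matching job setting is an assumption, since the text does not describe exactly how the subtraction was carried out.

```python
def mean_curve(curves):
    """Point-wise mean of equally long curves (one curve per shot)."""
    n = len(curves)
    return [sum(points) / n for points in zip(*curves)]


def subtract_noise(current_curve, noise_curve):
    """Remove a separately recorded noise profile from one current curve."""
    return [i - n for i, n in zip(current_curve, noise_curve)]


# Hypothetical 3-sample curves; real curves would hold 390 samples per shot.
noise_profile = mean_curve([[0.10, 0.20, 0.10], [0.30, 0.20, 0.10]])
corrected = subtract_noise([1.20, 2.20, 1.10], noise_profile)
```

Since the noise differed between deposit diameters, a separate noise profile per diameter setting would be needed under this approach.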

Figure 4.4.7: Calibrated measurement of current in a BGA job with the diameters of the deposits being 380 µm. The dark blue curve shows the mean current of all shots and the shaded area around the curve illustrates the standard deviation.

4.4.3 Prediction Model

The developed LSTM model is shown in Table 4.2.1. To verify and validate its

architecture and parameter settings, it was given the time series data set in [53], with the expectation that its

predictions should conform with the predictions made in [53]. The data set is used for

airline passenger prediction, and this project’s LSTM model gave predictions similar to

those in [53], which verifies that the model can make time series predictions. Figure

4.4.8a shows the predictions by the LSTM model in [53], while Figure 4.4.8b shows the

predictions made by the LSTM model from this project. The model designed for this

project conforms with the model in [53], thus, it is deemed to be verified and validated

for its purpose.

(a) (b)

Figure 4.4.8: Verification and validation of the LSTM model developed in this project by comparing its predictions with the LSTM model in [53]. (a) The training and test predictions of the LSTM model in [53]. (b) The training and test predictions of this project’s LSTM model.

Chapter 5

Results

In this chapter, the results from the test cases are shown. Section 5.1 gives the results

of the performance of the LSTM model using the BGA job, while Section 5.2 shows the

results while using the RT1 job. Loss curves, which have been described in Chapter

2, are presented. Graphs which compare the true and predicted diameter, as well as

graphs showing the distribution of the predictions, are also presented. The metrics

used for evaluating the diameter prediction of the LSTM model are MAE, MRE and

also the number of predictions where the relative error of the predicted diameter was

greater than 8% (Bad predictions). These metrics are presented in tables. Section 5.3

addresses to what extent the requirements were fulfilled in this project.
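The three metrics can be computed as in the sketch below. The function name and the example values are hypothetical, and MRE is assumed to be the mean of |predicted − true| / true expressed in percent, since its definition lies outside this chapter.

```python
def evaluate(true_diameters, predicted_diameters, threshold=0.08):
    """Return (MAE, MRE in percent, number of bad predictions)."""
    abs_errors = [abs(p - t) for t, p in zip(true_diameters, predicted_diameters)]
    rel_errors = [e / t for e, t in zip(abs_errors, true_diameters)]
    n = len(abs_errors)
    mae = sum(abs_errors) / n
    mre = 100.0 * sum(rel_errors) / n
    bad = sum(1 for r in rel_errors if r > threshold)  # relative error above 8%
    return mae, mre, bad


# Hypothetical diameters in micrometres, not measured data.
true_d = [380.0, 380.0, 380.0, 380.0]
pred_d = [385.0, 375.0, 380.0, 420.0]
mae, mre, bad = evaluate(true_d, pred_d)
```

In this toy example only the last prediction, which is about 10.5% off, counts as a bad prediction.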

5.1 BGA Results

The training process of the LSTM model using the BGA job is presented in Figure

5.1.1, which shows the training and validation loss when the LSTM model was given

all sensor data. The loss curves for the other sensor combinations in Table 5.1.1 are shown

in Appendix B.

Figure 5.1.1: Training and validation loss when training the LSTM model using a BGA job. Trained for 25 epochs. Input parameters are the temperature from the three thermocouples and the current.

Table 5.1.1 contains the BGA test results with different combinations of input

parameters as shown in Table 4.3.2. Three runs were performed for each combination

using the test data, and the average values from the runs are presented in the table. The

input parameters are, as mentioned earlier, the three temperature sensors (T1, T2, T3),

whose placements are defined in Figure 1.2.1, and the current (I). Results when only

temperature data and no current data are used are shown in Appendix C.

Table 5.1.1: BGA test results for different sensor combinations.

                  Sensor combination
                  I       I, T1   I, T2   I, T3   I, T1, T2, T3
MAE [µm]          5.81    5.82    5.77    5.82    5.82
MRE [%]           1.52    1.52    1.51    1.52    1.52
Bad predictions   71      94      61      77      80

In addition to the table above, showing the performance of the LSTM model on

unseen test data, graphs of the predicted and true diameter of each shot are shown

below. These graphs are shown in Figure 5.1.2, Figure 5.1.3 and Figure 5.1.4, where

each figure has two subfigures with different window scales: one which shows all test

data and one which shows a smaller region chosen arbitrarily. Figure 5.1.2 only has

current data as input, Figure 5.1.3 has input data from the three temperature positions

individually, together with supporting data from the current sensor, and Figure 5.1.4

uses all sensor data as input.

(a) (b)

Figure 5.1.2: Predicted and true diameter using a BGA job with I as input. (a) The complete run of the test data. (b) Diameters of 250 shots between shot 14,000 and 14,250.

(a) (b)

Figure 5.1.3: Predicted and true diameter using a BGA job with (T1, I), (T2, I) and (T3, I) as input. (a) The complete run of the test data. (b) Diameters of 250 shots between shot 14,000 and 14,250.

Distribution graphs of the predicted and true diameter of each run are shown in

Figure 5.1.5. In Figure 5.1.5b, the blue and red curves are hard to distinguish since

they directly overlap. The green curve protrudes at the top, but the majority of its body

overlaps with the other curves.

(a) (b)

Figure 5.1.4: Predicted and true diameter using a BGA job with T1, T2, T3 and I as input. (a) The complete run of the test data. (b) Diameters of 250 shots between shot 14,000 and 14,250.

(a) (b)

(c)

Figure 5.1.5: Distribution of predicted diameters and distribution of true diameters using a BGA job. (a) Input variable is I. (b) Input variables are I and each of the temperature sensors. (c) Input variables are I, T1, T2 and T3.

5.2 RT1 Results

The training process of the LSTM model using the RT1 job is presented in Figure 5.2.1,

which shows the loss of the training and validation.

Figure 5.2.1: Training and validation loss when training the LSTM model using an RT1 job. Trained for 35 epochs.

Table 5.2.1 contains the RT1 test results with the combined sensor data as input.

Three runs were performed for this combination using the test data and the average

values from the runs are presented in the table.

Table 5.2.1: RT1 test results for selected sensor combination.

                  Sensor combination
                  I, T1, T2, T3
MAE [µm]          6.22
MRE [%]           1.50
Bad predictions   132

Figure 5.2.2 shows the predicted and true diameter with two different window

scales with a sensor combination of I, T1, T2 and T3 as presented in Table 5.2.1. In

Figure 5.2.2a, each plateau contains shots deposited at 160, 230 and 300 Hz. Figure

5.2.3 shows the distribution of the predictions. Appendix C shows the graphs of the

performance of the LSTM model when only given the temperature data as input.

(a) (b)

Figure 5.2.2: Predicted and true diameter using an RT1 job with I, T1, T2 and T3 as input. (a) The complete run of the test data. (b) Diameters of 250 shots between shot 8,000 and 8,250.

Figure 5.2.3: Distribution of predicted diameters made by the LSTM and the true diameters distribution using an RT1 job. The input configuration is I, T1, T2, and T3.

5.3 Fulfillment of Requirements

This section relates back to the requirements and answers to what extent they were

fulfilled based on the results from the cases.

• A neural network shall be trained to predict changes in the quality of jetting

deposits which later can be used for real-time prediction.

The neural network identified trends in the diameter, but variations between shots

could not be identified accurately by the neural network. It was more accurate

whenever the measured diameter followed a repeating pattern than when it was mostly

random. The neural network was trained and evaluated off-line to tune its weights so

that the model could later be given real-time sensor input to predict quality in terms of diameter.

• Temperature and current shall be measured and the data shall be used as input

to the neural network.

Both temperature and current were measured while performing the BGA and the RT1

job. The data was given to the neural network in order to make predictions about

the quality of the deposits. Improvements from previous work were made to both

the temperature and the current measurements to assure as accurate input data as

possible.

• Three different locations on the ejector shall be used for temperature

measurements.

The same configuration was used as in the previous year’s master’s thesis: a sensor at

the end of the Archimedes screw, at the solder paste container in the ejector and at the

chamber next to the piston. These positions are shown in Figure 1.2.1.

• The quality of the solder paste shall be based on the diameter of the shots and

these shall be measured using a MY700 jet printer.

The quality of the deposits, for this project, was based on their diameter. The neural

network predicted the diameter of the deposits based on its two types of sensor data

input. The tests were performed using the MY700, which is able to take quality

measurements of the deposits.

• Acceptable results from the neural network require the predicted diameter to

vary less than 8% from the actual diameter for individual predictions.

Not all of the predictions were within the acceptable range of 8% from the actual

diameter, but the vast majority were. For the BGA job, 80 out of 50,000 predictions (0.16%) were

non-acceptable when all available sensor data was used, and for the RT1 job, 132 out

of 50,000 predictions (0.26%) were non-acceptable.

Chapter 6

Discussion

In this chapter, the findings from the case study are discussed, including the BGA and

RT1 results, the sensor implementation, the requirements and the research method.

6.1 Test Cases

The results from the different test cases are shown in Chapter 5. The performance of

the neural network for these test cases is discussed below.

6.1.1 BGA

The first job that was examined was the BGA job, since this was considered to be the

simpler case. The neural network was first trained using only the three temperature

sensors as input to get an idea if the thermocouples alone would be enough to make

accurate diameter predictions. Since the current is not used in this case, it does

not relate directly to the research question. However, it does help to improve the

understanding of the effects of the current. The predictions for this case can be seen in

Figure C.0.1 in Appendix C. From the graphs, it becomes clear that the neural network

is not able to accurately predict the diameter of individual shots, but rather finds an

average value for the diameter which the predictions are always close to. This was

expected considering that the temperature had been filtered with a low pass filter,

meaning that large differences in temperature between consecutive shots had been

filtered out. This was done to compensate for the noise, as mentioned in Section 4.4.1.

However, in the test data there are two clear deviations of the measured diameter where

the diameter deviates downwards, which can be seen in Figure C.0.1a. For those shots,

the predicted diameter reacts by oscillating more, which is an indication that the neural

network is able to notice some sort of deviation. While this configuration of sensors

was able to find a good average value for the BGA job, it did not work as well for the

RT1 job where the diameter varied, as seen in Appendix C. This is because it was not

able to anticipate the diameter changes from the temperature alone.

Next, the neural network was trained using only the current as input. Figure 5.1.2

shows the resulting predictions from this training. The graphs show that the neural

network no longer only predicts values close to the mean diameter, indicating the

current gives more information about individual shots. This is also confirmed by

looking at the distribution graph in Figure 5.1.5a, where a larger distribution for the

predictions can be seen than in Figure C.0.1b in Appendix C. It is also still possible to

find the two locations where the diameter deviates downwards. The average MAE and

MRE for this configuration were 5.81 μm and 1.52%, respectively, which can be seen

in Table 5.1.1. One of the requirements for this project was that the predicted diameter

should vary less than 8% from the actual diameter. The number of predictions where

the relative error was larger than 8% for this configuration was 71 out of 50,000 shots.

The next three cases were used to compare the viability of the different temperature

sensor locations. Each of the temperature sensors was tested together with the current.

The results from these cases can be seen in Table 5.1.1, and the numbers show that there

is no significant difference between the different locations. Using the current together

with Sensor 2 gave the fewest bad predictions. However, the number of bad

predictions could vary by 40 shots from run to run. This means that the differences

were likely not significant, which is also supported by the MAE and MRE numbers. The

numbers for these three cases are also similar to the case which only had the current as

input, suggesting that the current is responsible for the majority of the performance.

Figure 5.1.3 shows the predictions for all three cases, and it shows that the three cases

performed similarly. The distributions can be seen in Figure 5.1.5b, and they are also

similar for the three cases.

The final configuration that was tested with the BGA job was with all sensors.

Theoretically this should lead to the best performance since it has the most data

available, but the error was approximately the same as the previous cases, as shown

in Table 5.1.1. Figures 5.1.4 and 5.1.5c also show that the performance is similar. The

distribution seems to be slightly better when all sensors were used than when only the

current was used, but the difference is not significant and it could be due to results

varying slightly between runs. The loss curves for this configuration can be seen in

Figure 5.1.1. From this graph it can be seen that the training loss decreases while the

validation loss does not. This indicates that the neural network is struggling

to generalize, that is, it is overfitting to the training data.
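The pattern described here, a falling training loss with a flat validation loss, can be checked programmatically from the per-epoch losses. This is a minimal sketch under stated assumptions: the function names, the `patience` parameter and the loss values are hypothetical and are not taken from Figure 5.1.1.

```python
def best_epoch(val_losses):
    """1-based index of the epoch with the lowest validation loss."""
    return min(range(len(val_losses)), key=val_losses.__getitem__) + 1


def is_overfitting(train_losses, val_losses, patience=3):
    """True if validation loss has not improved over the last `patience` epochs
    while training loss kept decreasing over the same epochs."""
    if len(val_losses) <= patience:
        return False
    best_before = min(val_losses[:-patience])
    val_stalled = min(val_losses[-patience:]) >= best_before
    train_improving = train_losses[-1] < train_losses[-patience - 1]
    return val_stalled and train_improving


# Hypothetical per-epoch losses (MAE in micrometres).
train_loss = [8.0, 7.0, 6.5, 6.2, 6.0, 5.9]
val_loss = [7.5, 6.9, 6.8, 6.9, 6.85, 6.9]
```

With these example values, `best_epoch(val_loss)` is 3 and `is_overfitting(train_loss, val_loss)` is true, which would suggest stopping training around the third epoch.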

6.1.2 RT1

The input parameters to the neural network for the RT1 job were chosen to

be both types of sensor data, since the temperature could capture slower trends while the

current was, to some extent, better for single-dot prediction. The purpose of this case

was to validate the robustness of the neural network by varying the diameter of the

deposits as well as the operating frequency. By comparing Table 5.2.1 and Table 5.1.1

it can be seen that the numbers are roughly the same, but with a slightly larger number

of bad predictions and larger MAE for the RT1 job. Training the neural network on a

more complicated pattern did not significantly decrease the quality of the predictions.

Figure 5.2.3 shows that there is a bigger spread of the predictions for the larger

deposited diameters. This does not tell us anything about the accuracy of the

predictions, but rather the ability to identify variations in the ejected diameters. From

Figure 5.2.2b and Figure 5.2.3 it can be seen that the neural network underestimates

the true diameter. The distribution of the predictions in Figure 5.2.3 is shifted slightly

to the lower range of diameters, which is exemplified in Figure 5.2.2b where the

majority of the predictions are in the lower range. This is clearly evident for the larger

deposits, while not so much for the smaller deposits.

Figure 5.2.1 shows that most of the learning happened in the earlier epochs, with

only a slight decrease in loss in the remaining epochs. A larger number of epochs

could potentially lead to overfitting, as for the BGA test, which would make the neural

network poor at generalizing on unseen data. However, one can distinguish a low

degree of underfitting, which causes the neural network to make poor decisions about

the underlying structure of the data. In other words, it makes assumptions about the

data rather than finding relationships between the inputs and outputs. This could be

an important finding since the quality of the predictions varies between the different

ejected diameters. That could possibly be explained by the underfitting.

6.1.3 Neural Network Performance

Since the measured diameters of most shots in both job types were concentrated

around the expected value, the neural network could base its predictions in that range

without being heavily penalized. Trying to predict the anomalies and larger deviations

will come at a risk of increasing the error if the prediction is not accurate. In other

words, the model needs to be able to see clear correlations in the data in order to

accurately predict larger deviations. From the graphs in Chapter 5, it could thus be said

that the data the LSTM model is working with does not have clear enough correlations

between input data and the measured diameter. This could be due to factors such as

noisy input data, or to the temperature and current not being sufficient on their own.
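
The penalty argument above can be illustrated with a small Python sketch. It compares the mean absolute error of a cautious predictor, which always outputs the expected diameter, with a bolder predictor that attempts to predict a deviation but places it on the wrong shot. All diameters are hypothetical values in µm chosen for illustration; they are not measurements from this project.

```python
def mean_absolute_error(true_values, predictions):
    """Mean absolute error between two equally long sequences."""
    return sum(abs(t - p) for t, p in zip(true_values, predictions)) / len(true_values)


# Hypothetical diameters in µm (illustrative only): one anomaly at index 4.
true_diameters = [380.0, 381.0, 379.0, 380.0, 350.0, 380.0]

# Cautious predictor: always outputs the expected diameter.
cautious = [380.0] * len(true_diameters)

# Bold predictor: predicts a large deviation, but on the wrong shot (index 2).
bold = [380.0, 380.0, 350.0, 380.0, 380.0, 380.0]

mae_cautious = mean_absolute_error(true_diameters, cautious)
mae_bold = mean_absolute_error(true_diameters, bold)
```

In this toy case the cautious predictor misses the anomaly but still obtains the lower error, since the misplaced deviation is penalized in addition to the missed one.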



6.2 Sensors

As mentioned earlier, the thermocouples were affected by noise. Since the sensor

cables were mounted along the high current cables going to the piezo, the most

probable reason for the measurement noise is EMI. The low pass filter reduced this,

but it was difficult to design the filter such that only the noise would be filtered out and

that the true temperature signal would not be affected. It should also be noted that

the measurement from the three different locations of the temperature sensors gave

approximately the same temperature, as seen in Figure 4.4.3b. This could possibly

be explained in two ways: the aluminum body of the ejector diffused the generated

heat efficiently and adopted the same temperature, or not enough additional heat

was emitted from the locations to be registered by the thermocouples. Furthermore,

the possible heat generated at the three sensor locations, see Figure 1.2.1, most likely

introduced a lag since the heat had to cross the aluminum body before it was measured.

This could have impacted the quality of the prediction, especially when only having the

temperature data as input.
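
The filter design used in this project is not reproduced here. As an illustration of the trade-off between noise suppression and signal distortion, the sketch below assumes a first-order (exponential) low-pass filter and applies it to a hypothetical temperature step. The smoothing factor `alpha`, the helper `settling_index` and the sample values are assumptions made for this example only.

```python
def low_pass(samples, alpha):
    """First-order IIR low-pass filter: y[n] = y[n-1] + alpha * (x[n] - y[n-1]).
    A smaller alpha suppresses more noise but reacts more slowly."""
    filtered = []
    y = samples[0]
    for x in samples:
        y = y + alpha * (x - y)
        filtered.append(y)
    return filtered


def settling_index(signal, target, tol=0.1):
    """Index of the first sample within `tol` of `target`, or None."""
    for i, value in enumerate(signal):
        if abs(value - target) <= tol:
            return i
    return None


# Hypothetical noise-free temperature step from 29 °C to 30 °C at index 5.
step = [29.0] * 5 + [30.0] * 40

light = low_pass(step, alpha=0.5)   # weak filtering, fast response
heavy = low_pass(step, alpha=0.1)   # strong filtering, slow response

light_settled = settling_index(light, 30.0)
heavy_settled = settling_index(heavy, 30.0)
```

The more strongly filtered signal takes considerably more samples to reach the new temperature, which shows how aggressive noise filtering can also delay or flatten true temperature changes.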

The temperature seemed to be fairly constant throughout the jobs, where the overall

trend showed that the temperature increased slightly as the job went on. The machine’s

ejector temperature was set to 29 °C, and Figure 4.4.3b shows that the measured

temperature from the thermocouples was close to that target. In a previous master’s

thesis from 2019, the temperature varied more when the ambient temperature was 18 °C [4].

When it was 24 °C, it was mostly constant. The MY700 used in this project was

standing in a room where the ambient temperature was 23-24 °C, so the fact that the

temperature was fairly constant during a job is consistent with the previous results in [4].

If the temperature in the room were lower, meaning that the temperature would likely

vary more, it is possible that the gathered temperature data would have a larger impact

on the performance.

When removing the noise from the current curve, the average noise curve was

measured and saved for each diameter. This was because the graphs in Figure 4.4.6

showed that it was the diameter that affected the noise and not the frequency. When

subtracting the noise from the measurement, the correct noise curve was chosen

depending on what the diameter of the shot was. A more accurate way of removing

the noise would have been to measure the noise for each individual shot, and this

would also improve the internal validity of this project. This was not feasible in this

project however, since it would double the amount of data that needed to be stored

and processed. High memory usage was already an issue in this project, and therefore

this method was not used.
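
The per-diameter noise removal described above can be sketched in Python as follows. The function names, the data layout (a dictionary from diameter setting to recorded noise curves) and all numerical values are hypothetical and serve only to illustrate the idea; the actual implementation and data format used in this project may differ.

```python
def average_curve(curves):
    """Point-wise average of several equally long curves."""
    count = len(curves)
    return [sum(samples) / count for samples in zip(*curves)]


def build_noise_table(noise_recordings):
    """Map each diameter setting to its average noise curve.
    noise_recordings: dict of diameter -> list of recorded noise curves."""
    return {diameter: average_curve(curves)
            for diameter, curves in noise_recordings.items()}


def remove_noise(current_curve, diameter, noise_table):
    """Subtract the stored average noise curve for the shot's diameter."""
    noise = noise_table[diameter]
    return [c - n for c, n in zip(current_curve, noise)]


# Hypothetical noise recordings in arbitrary units, keyed by diameter in µm.
recordings = {
    330: [[0.1, 0.2, 0.1], [0.3, 0.2, 0.1]],
    400: [[0.5, 0.5, 0.5]],
}
noise_table = build_noise_table(recordings)

# Hypothetical measured current curve for one shot with diameter setting 330.
cleaned = remove_noise([1.2, 2.2, 1.1], 330, noise_table)
```

Only one averaged curve per diameter has to be stored in this approach, which is what keeps the memory usage lower than measuring the noise for every individual shot.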



6.3 Requirements

The requirements of this project were all fulfilled, except the last one. This requirement

was considered not fulfilled since the predicted diameter did not follow the true

diameter closely enough for individual anomalies to be detected. This resulted

in some predictions being outside the 8% range. In the BGA job, the LSTM model

seemed able to find deviating trends, but in the RT1 job the model was struggling to

find deviating trends in the quality. That is exemplified by the fourth plateau in Figure

5.2.2a, which in its second part has a deviating trend towards smaller diameters that

the model could not predict. One reason why the LSTM model had difficulty

predicting the quality could be the random variations in the diameter. It is possible that

the amplitude of the variations could have increased slightly as a consequence of raising

the nozzle from 650 µm to 800 µm above the surface due to the protruding sensor.

That difference in height will have an impact on the quality of the deposits since the

time of flight between the nozzle and the surface increases, causing the diameter of the

droplets to vary more.

6.4 Research Method

The use of a case study was appropriate for this project since it enabled knowledge to

be obtained regarding what affects the quality of jetting deposits. Using two job types,

BGA and RT1, meant that a clearer evaluation of the neural network could be done.

That also strengthened the internal validity in terms of reliable results. This is exemplified

by the performance of the neural network with only the temperature as input, as seen

in Appendix C. The results from the BGA job suggest a capability of finding the mean

value of the job sequence, but the results from the RT1 job contradict this hypothesis

since the job is composed of five unique sequences in terms of diameter. Thus, any

distinct differences in the diameter in the job sequence will cause the temperature data

to be insufficient for making predictions. The robustness could also be tested when the

diameter varied in the RT1 job. As mentioned above in Section 6.2, the internal validity

of the project was impacted by noisy measurements. Measures were taken to reduce

the effects of the noise as much as possible, but as mentioned in Chapter 8, it can be

looked into further in future work. The external validity could also be improved in

future projects, and this is also discussed in Chapter 8.


Chapter 7

Conclusions

The purpose of this project, as reflected in the research question, was to evaluate the

usefulness of three temperature sensors’ positions in regard to increasing the accuracy

of a neural network used for predicting jetted solder paste quality. As mentioned in

Chapter 1, the research question is:

In a piezo-based material depositing device, what are the implications

of the predetermined temperature sensor positions, when providing

supporting data from a current sensor, in regard to increasing the

accuracy of predicting jetted solder paste quality by training a neural

network?

The case study performed in this project has shown that none of the temperature

sensors significantly improved the performance of the neural network, and there were

no considerable differences between the three sensors. However, the temperature data

seemed to be able to help the neural network recognize slower trends of the diameter

that lasted over several shots, even though it did not have a large impact on the

accuracy, while the current was more useful for individual shot prediction. The same

behaviour was noticed for both types of jobs used. This summarizes the hypothesis

formed in this project, while also answering the research

question.

All the requirements of the project were fulfilled except for the last one which

concerned the accuracy of the predictions. The main reasons for this are thought to

be that the data used did not contain enough information to make predictions at that

level of accuracy, and also that noise affected the quality of the measurements. The

degree of randomness of the diameter was also higher than expected, making the task

more difficult. Possible future work is discussed in Chapter 8 below, and includes

investigating other types of sensor data, reducing the amount of noise and investigating

other measurements of quality. The project has been conducted with future use in


mind, and the neural network can be adapted to be used on different machines and

with different input and output data. With some further improvements, the neural

network could be utilized for making accurate real-time predictions, which would be a

benefit in the jet printing process.


Chapter 8

Future Work

This master’s thesis explored the possibility of collecting two different types of sensor

data from components of the MY700 jet printing machine and using a neural network

to process that data in order to make predictions about the quality of the deposits. The

project was the first to combine these two elements and examine their usefulness

in this area of engineering at Mycronic. It has been concluded that there are two

alternative ways to proceed with this project in the future: either more research on

improving the neural network, or investigating other variables to be measured that

could have more impact on the quality. There are also interesting possibilities for

future work in the long term, once the neural network has been improved. This

includes looking into creating an interface which in real time informs the user of the

predicted quality, and also creating a control loop which adjusts the jetting parameters

in real time.

Further improvements of the software should include evaluating different types of

neural network architectures on existing data. By comparing different architectures,

one can more easily draw conclusions regarding the importance of the sensor data

for quality predictions, as well as the most suitable type of neural network. In this

direction of future improvements, investigating other measurements of quality, such as

the shape of deposits or the number of satellites, could be of interest. Since the mounting

of Sensor 1, see Figure 1.2.1, required raising the nozzle about 150 µm, a likely consequence

was worsened deposit quality. Thus, any future evaluation of different neural networks

should consider redesigning the mounting of Sensor 1 so that the nozzle does not need to

be raised.

The other option for improving this project would be more focused on the hardware

configuration. This includes performing tests on different machines and ejectors,

which would improve the external validity, but also investigating other types of sensor

data. However, evaluating on different machines and ejectors should be the secondary

option, while investigating other sensor data should be the primary option, since it is


desired to first improve the performance of the neural network. One suggestion for

sensor data that would be of interest is the actual voltage level at the piezo.

Figure 2.1.2 shows the desired voltage at the piezo from the parameter

settings, and currently there is no feedback of the actual voltage level. Since the voltage

level is noisy, some effort would also have to be made to effectively filter the signal. It

would also be of interest to look into different ways of shielding the thermocouples

from EMI from the machine. This would reduce the noise and probably make the

temperature data easier to correlate to the quality of the deposits, while also improving

the internal validity.


References

[1] N. Coenen. Industry trends are boosting Jet Printing. Mycronic AB. 2015. URL:

https://www.smta.org/chapters/files/SMTA-Capital-Chapter-2015-Industry-trends-boosting-Jet-Printing.pdf (visited on 11/18/2019).

[2] Mycronic AB. URL: https://www.mycronic.com/en/about-mycronic/ (visited

on 01/20/2020).

[3] E. Kolibacz. “Classification of incorrectly picked components

using Convolutional Neural Networks”. Master’s Thesis. KTH Royal Institute

of Technology, 2018.

[4] B. Björnsdóttir. “Feedback strategies to decrease droplet variability in drop-on-

demand deposition of complex fluids”. Master’s Thesis. KTH Royal Institute of

Technology, 2019.

[5] M. D. Baker, C. D. Himmel, and G. S. May. “Time series modeling of reactive

ion etching using neural networks”. In: IEEE Transactions on Semiconductor

Manufacturing 8.1 (Feb. 1995), pp. 62–71. DOI: 10.1109/66.350758.

[6] B. Zhang and G. S. May. “Towards real time fault identification in plasma

etching using neural networks”. In: IEEE/SEMI 1998 IEEE/SEMI Advanced

Semiconductor Manufacturing Conference and Workshop. Sept. 1998, pp. 61–65.

DOI: 10.1109/ASMC.1998.731394.

[7] C. J. Spanos, H. F. Guo, A. Miller, and J. Levine-Parrill. “Real-time statistical

process control using tool data (semiconductor manufacturing)”. In: IEEE

Transactions on Semiconductor Manufacturing 5.4 (Nov. 1992), pp. 308–318.

DOI: 10.1109/66.175363.

[8] S. Mallik, M. Schmidt, R. Bauer, and N. N. Ekere. “Influence of solder paste

components on rheological behaviour”. In: 2008 2nd Electronics System-

Integration Technology Conference. Sept. 2008, pp. 1135–1140. DOI: 10.1109/ESTC.2008.4684512.


[9] A. E. Marks, S. Mallik, N. N. Ekere, and A. Seman. “Effect of temperature on

slumping behaviour of lead-free solder paste and its rheological simulation”. In:

2008 2nd Electronics System-Integration Technology Conference. Sept. 2008,

pp. 829–832. DOI: 10.1109/ESTC.2008.4684459.

[10] J. Leal, G. Mårtensson, and N. Augustis. Solder Paste Jetting: An Integral

Approach. Mycronic AB. Nov. 2018. URL: http://smt.iconnect007.com/index.php/article/113955/solder-paste-jetting-an-integral-approach/113958/?skin=smt (visited on 01/16/2020).

[11] S. X. Fu. “Finding Optimal Jetting Waveform Parameters with Bayesian

Optimization”. Master’s Thesis. KTH Royal Institute of Technology, 2018.

[12] APC International Ltd. Piezo Theory. 2016. URL: https://www.americanpiezo.com/knowledge-center/piezo-theory.html (visited on

01/17/2020).

[13] H. C. Liaw, B. Shirinzadeh, and J. Smith. “Sliding-Mode Enhanced Adaptive

Motion Tracking Control of Piezoelectric Actuation Systems for Micro/Nano

Manipulation”. In: IEEE Transactions on Control Systems Technology 16.4

(July 2008), pp. 826–833. ISSN: 2374-0159. DOI: 10.1109/TCST.2007.916301.

[14] Y. Ham, B. An, M. A. Trimzi, G. Lee, J. Park, and S. Yun. “An experimental study

on the displacement amplification mechanism driven by piezoelectric actuators

for jet dispenser”. In: 2016 International Conference on Manipulation,

Automation and Robotics at Small Scales (MARSS). July 2016, pp. 1–5. DOI:

10.1109/MARSS.2016.7561742.

[15] D. Collins. FAQ: What are stacked piezo actuators and what do they do? Nov.

2015. URL: https://www.motioncontroltips.com/faq-what-are-stacked-piezo-actuators-and-what-do-they-do/ (visited on 01/18/2020).

[16] J. Park and W. Moon. “Hysteresis compensation of piezoelectric actuators: The

modified Rayleigh model”. In: Ultrasonics 50.3 (2010), pp. 335–339. ISSN:

0041-624X. DOI: 10.1016/j.ultras.2009.10.012. URL: http://www.sciencedirect.com/science/article/pii/S0041624X09001498.

[17] APC International Ltd. Stripe Actuators. 2016. URL: https://www.americanpiezo.com/standard-products/stripe-actuators.html (visited

on 01/18/2020).

[18] J. Vinnars and J. Vinnars. “Correlations Between Rheological Properties and

Jetting Results in Solder Paste Jetting”. Master’s Thesis. Uppsala Universitet,

June 2017.


[19] D. E. Alexander. “Chapter 4 - Biological Materials Blur Boundaries”. In: Nature’s

Machines. Academic Press, 2017, pp. 111–114. ISBN: 978-0-12-804404-9.

[20] R. P. Chhabra and J. F. Richardson. “Chapter 1 - Non-Newtonian Fluid

Behaviour”. In: Non-Newtonian Flow and Applied Rheology (Second Edition).

Second Edition. Oxford: Butterworth-Heinemann, 2008, pp. 1–55. ISBN: 978-

0-7506-8532-0.

[21] N. Chandran, S. Chandran, and S. Thomas. “Chapter 1 - Introduction to

rheology”. In: Rheology of Polymer Blends and Nanocomposites. Micro and

Nano Technologies. Elsevier, 2020, pp. 1–17. ISBN: 978-0-12-816957-5.

[22] M. Judd and K. Brindley. “6 - Solder paste”. In: Soldering in Electronics

Assembly (Second Edition). Second Edition. Oxford: Newnes, 1999, pp. 109–

126. ISBN: 978-0-7506-3545-5.

[23] M. M. Schwartz. Soldering: Understanding the Basics. ASM International,

2014. ISBN: 9781627080583.

[24] E. Landman. “Viscosity control of solder paste by ultrasound actuation”.

Master’s Thesis. KTH Royal Institute of Technology, 2018.

[25] D. Ibrahim. “Chapter 3 - Thermocouple Temperature Sensors”. In:

Microcontroller Based Temperature Monitoring and Control. Oxford: Newnes,

2002, pp. 63–85. ISBN: 978-0-7506-5556-9.

[26] P. R. N. Childs. “5 - Thermocouples”. In: Practical Temperature Measurement.

Oxford: Butterworth-Heinemann, 2001, pp. 98–144. ISBN: 978-0-7506-5080-

9.

[27] National Instruments. Current Measurements: How-To Guide. Oct. 2019. URL:

http://www.ni.com/tutorial/7114/en/ (visited on 04/28/2020).

[28] N. Patin. “1 - Sensors for Power Electronics”. In: Power Electronics Applied

to Industrial Systems and Transports. Elsevier, 2016, pp. 1–73. ISBN: 978-1-

78548-033-1.

[29] M. Ossmann. Red Pitaya Not just a USB scope module. Nov. 2014. URL: https://www.elektormagazine.com/assets/upload/files/EN2014120381.pdf (visited on 01/19/2020).

[30] H. Baggen. Review: The new Red Pitaya line. Nov. 2014. URL: https://www.elektormagazine.com/news/review-the-new-red-pitaya-line (visited on

01/19/2020).

[31] Digi-Key Electronics. Red Pitaya STEMlab Device. July 2018. URL: https://www.digikey.ro/en/ptm/t/trenz/red-pitaya-stemlab-device/tutorial (visited on 01/19/2020).


[32] R. Singh. FPGA Vs ASIC: Differences Between Them And Which One To Use?

July 2018. URL: https://numato.com/blog/differences-between-fpga-and-asics/ (visited on 01/19/2020).

[33] D. R. Baughman and Y. A. Liu. “1 - Introduction to Neural Networks”. In: Neural

Networks in Bioprocessing and Chemical Engineering. Boston: Academic

Press, 1995, pp. 1–20. ISBN: 978-0-12-083030-5. DOI: 10.1016/B978-0-12-083030-5.50007-2. URL: http://www.sciencedirect.com/science/article/pii/B9780120830305500072.

[34] X. Yang. “8 - Neural networks and deep learning”. In: Introduction to

Algorithms for Data Mining and Machine Learning. Academic Press, 2019,

pp. 139–161. ISBN: 978-0-12-817216-2. DOI: 10.1016/B978-0-12-817216-2.00015-6. URL: http://www.sciencedirect.com/science/article/pii/B9780128172162000156.

[35] I. Goodfellow, Y. Bengio, and A. Courville. Deep Learning. http://www.deeplearningbook.org. MIT Press, 2016.

[36] N. Donges. Recurrent neural networks 101: Understanding the basics of RNNs

and LSTM. June 2019. URL: https://builtin.com/data-science/recurrent-neural-networks-and-lstm (visited on 01/23/2020).

[37] J. McGonagle, C. Williams, and J. Khim. Recurrent Neural Network. URL:

https://brilliant.org/wiki/recurrent-neural-network/ (visited on

01/23/2020).

[38] C. Olah. Understanding LSTM Networks. Aug. 2015. URL: http://colah.github.io/posts/2015-08-Understanding-LSTMs/ (visited on 01/26/2020).

[39] The MathWorks Inc. Long Short-Term Memory Networks. URL: https://se.mathworks.com/help/deeplearning/ug/long-short-term-memory-networks.html;jsessionid=2913b326abc1143e5efc7917a044 (visited on 01/26/2020).

[40] J. Brownlee. Long short-term memory networks with Python: develop

sequence prediction models with deep learning. v1.0. 2017.

[41] S. Rajasekar, P. Philominathan, and V. Chinnathambi. Research Methodology.

2006. arXiv: physics/0601009 [physics.gen-ph]. URL: https://arxiv.org/pdf/physics/0601009.pdf (visited on 04/24/2020).

[42] A. Håkansson. “Portal of Research Methods and Methodologies for Research

Projects and Degree Projects”. In: Proceedings of the International Conference

on Frontiers in Education : Computer Science and Computer Engineering

FECS’13. CSREA Press U.S.A, 2013, pp. 67–73.


[43] D. Ary, L. Cheser Jacobs, C. Sorensen, and A. Razavieh. Introduction to

Research in Education. 8th edition. Wadsworth, Cengage Learning, 2010.

[44] D. Muijs. Doing quantitative research in education with SPSS. Sage, 2010.

[45] D. E. Perry. Case Studies. 2004. URL: http://users.ece.utexas.edu/~perry/education/382c/L06.pdf (visited on 04/24/2020).

[46] M. Shuttleworth. Case Study Research Design. Apr. 2008. URL: https://explorable.com/case-study-research-design (visited on 01/28/2020).

[47] The Open University. Case Studies and Experiments. pp. 63-70. 2013. URL:

https://www.open.edu/openlearncreate/pluginfile.php/50733/mod_oucontent/oucontent/550/none/none/deh313_1blk2.12.pdf? (visited on

04/25/2020).

[48] R. B. Johnson. “Examining the validity structure of qualitative research”. In:

Education 118.2 (1997), p. 282.

[49] D. T. Campbell and J. C. Stanley. “Experimental and Quasi-Experimental

Designs for Research”. In: Handbook of Research on Teaching. Houghton

Mifflin Company, 1963, pp. 5–22. ISBN: 0-395-30787-2.

[50] Physitemp - Precision Temperature Specialists. 2019. URL: https://physitemp.com/ (visited on 05/27/2020).

[51] ITS-90 Table for type T Thermocouple (Ref Junction 0°C). Reotemp

Instruments. Nov. 2014. URL: https://www.thermocoupleinfo.com/pdf/type-t-thermocouple-reference-table.pdf (visited on 02/13/2020).

[52] Analog Devices Inc. AD594/AD595. Monolithic Thermocouple Amplifiers with

Cold Junction Compensation. URL: https://www.sparkfun.com/datasheets/IC/AD595.pdf (visited on 02/12/2020).

[53] J. Brownlee. Time Series Prediction with LSTM Recurrent Neural Networks

in Python with Keras. Aug. 2019. URL: https://machinelearningmastery.com/time-series-prediction-lstm-recurrent-neural-networks-python-keras/.




Appendix A

PCB Schematic Overview

A schematic overview of the PCB can be seen in Figure A.0.1 below.

Figure A.0.1: PCB Schematic Overview.


Appendix B

Training and Validation Loss

The training and validation loss of the LSTM model when using the BGA job is shown in

Figure B.0.1. The input parameter configurations are: (T1, T2, T3), (I, T1), (I, T2) and

(I, T3).

Figure B.0.1: Training and validation loss for different input parameter configurations using a BGA job. (a) Input parameters: T1, T2, T3. (b) Input parameters: I, T1. (c) Input parameters: I, T2. (d) Input parameters: I, T3.


Appendix C

Performance Without Current

Table C.0.1 contains the results from training the neural network using the temperature

data from the three sensors without the piezo current data. The results while using

both the BGA and RT1 jobs are shown. Three runs were performed for each combination

using the test data, and the average values from the runs are presented in the table. The

metrics used for evaluating the diameter prediction of the LSTM model are MAE, MRE

and also the number of predictions where the relative error of the predicted diameter

was greater than 8% (Bad predictions).
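
As a minimal sketch, the three metrics could be computed as shown below. It is assumed here that MAE is the mean absolute error, that MRE is the mean of the absolute error divided by the true diameter, expressed in percent, and that a bad prediction is one whose relative error exceeds 8%. The function name and the example diameters are hypothetical and are not taken from the evaluation code or data of this project.

```python
def evaluate_predictions(true_diameters, predicted_diameters, threshold=0.08):
    """Return (MAE, MRE in percent, number of bad predictions).
    A prediction counts as bad when its relative error exceeds `threshold`."""
    count = len(true_diameters)
    abs_errors = [abs(p - t) for t, p in zip(true_diameters, predicted_diameters)]
    rel_errors = [e / t for e, t in zip(abs_errors, true_diameters)]
    mae = sum(abs_errors) / count
    mre_percent = 100.0 * sum(rel_errors) / count
    bad_predictions = sum(1 for r in rel_errors if r > threshold)
    return mae, mre_percent, bad_predictions


# Hypothetical diameters in µm (illustrative only).
true_example = [100.0, 200.0, 400.0]
predicted_example = [110.0, 200.0, 380.0]
mae, mre, bad = evaluate_predictions(true_example, predicted_example)
```

In this example only the first prediction, with a relative error of 10%, exceeds the 8% limit and is counted as a bad prediction.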

Table C.0.1: Test results when only using temperature data.

                     Job Type
Metric               BGA        RT1
MAE [µm]             5.77       57.24
MRE [%]              1.51       14.07
Bad predictions      197        32,870

Performance of the LSTM model using the BGA job when only given the

temperature data as input is shown in Figure C.0.1, where Figure C.0.1a shows the

predicted and true diameter and Figure C.0.1b shows the predicted and true diameter

distribution.


Figure C.0.1: Results from the predictions by the LSTM model when only given the temperature data as input when using a BGA job. (a) The predicted and true diameter. (b) Distribution of the predicted and true diameters.

The corresponding graphs for the RT1 job are shown in Figure C.0.2, where Figure

C.0.2a shows the predicted and true diameter and Figure C.0.2b shows the predicted

and true diameter distribution.

Figure C.0.2: Results from the predictions by the LSTM model when only given the temperature data as input when using an RT1 job. (a) The predicted and true diameter. (b) Distribution of the predicted and true diameters.
