DEGREE PROJECT IN MECHANICAL ENGINEERING, SECOND CYCLE, 30 CREDITS
STOCKHOLM, SWEDEN 2020
Quality Prediction in Jet Printing Using Neural Networks
Daniel Brun
Colin Lawless
KTH ROYAL INSTITUTE OF TECHNOLOGY
SCHOOL OF INDUSTRIAL ENGINEERING AND MANAGEMENT
Authors
Daniel Brun
Colin Lawless
KTH Royal Institute of Technology
Place for Project
Täby, Sweden
Mycronic AB
Examiner
Hans Johansson
Stockholm, Sweden
KTH Royal Institute of Technology
Supervisor at KTH
Carl During
Stockholm, Sweden
KTH Royal Institute of Technology
Supervisor at Mycronic
Gustaf Mårtensson
Täby, Sweden
Mycronic AB
Master’s Thesis Coordinator
Fredrik Asplund
Stockholm, Sweden
KTH Royal Institute of Technology
Master of Science Thesis TRITA-ITM-EX 2020:229
Quality Prediction in Jet Printing
Using Neural Networks
Daniel Brun
Colin Lawless
Approved
2020-06-01
Examiner
Hans Johansson
Supervisor
Carl During
Commissioner
Mycronic
Contact person
Gustaf Mårtensson
Abstract
Surface mount technology is widely used in the manufacturing of commercial
electronics, and the demands on the machines increase as the complexity of the
electronics increases and the size of the components decreases. Mycronic is a company
that focuses on addressing those demands with their high-technology jet printing and
pick-and-place machines. This master's thesis has been performed at Mycronic and has
focused on the MY700 jet printer. Due to unknown factors, the quality of the ejected
solder paste droplets from the machine can vary over time. It was therefore of interest
to monitor variables of the MY700 in order to gain more knowledge about the cause of
the varying quality, and also to be able to detect substantial changes in deposit quality.
In this project, the temperature has been measured at three key locations on the
ejector as well as the current going through the piezoelectric actuator. This data was
fed to a neural network in order to make quality predictions with respect to the
diameter of the solder paste deposits. Different combinations of sensor data were used
to evaluate how the different sensors affected the performance of the neural network.
Thereby, a better understanding of how big an impact the different variables had on
the quality of the deposits could be achieved.
The results indicate that the current was more significant than the temperature for
making quality predictions. Using only the temperature data, the neural network was
not able to accurately predict quality deviations, whereas with the piezo current data
or both of them combined, better predictions could be made. The current data also
significantly improved the performance of the neural network when printing jobs with
varying diameters were used. The conclusion is that none of the three temperature
sensors significantly improved the performance, and there were no considerable
differences between them, while the current did improve it.
Degree Project TRITA-ITM-EX 2020:229
Quality Estimation of Jet-Dispensed Solder Paste
with a Neural Network
Daniel Brun
Colin Lawless
Approved
2020-06-01
Examiner
Hans Johansson
Supervisor
Carl During
Commissioner
Mycronic
Contact person
Gustaf Mårtensson
Summary
Surface mount technology is a well-established method used in the manufacturing of
commercial electronics, and the demands on these machines increase as the complexity
of the electronics increases and the size of the components decreases. Mycronic is a
company whose focus is on meeting these demands with its high-technology jet printing
and pick-and-place machines. This degree project was carried out at Mycronic and
focused on the MY700 jet printing machine. Due to unknown factors, the quality of the
solder paste deposited by the machine can vary over time. It was therefore of interest
to monitor variables of the machine in order to gain more knowledge about the cause of
the varying quality and also to be able to detect changes in quality.
In this project, the temperature was measured at three critical positions on the
ejector, as was the current through the piezoelectric actuator. These data were given
to a neural network in order to make quality predictions with respect to the diameter
of the solder paste deposits. Different combinations of sensor data were used to
evaluate how the different sensors affected the performance of the neural network.
Thereby, a better understanding of how large an impact the different variables had
on the quality of the deposits could be achieved.
The results indicate that the current was more significant than the temperature for
making quality predictions. When only temperature data were used, the neural
network was unable to make accurate predictions of quality deviations, whereas with
only current data or both combined, better predictions could be made. The current
data also improved the performance of the neural network when jobs with different
diameters were used. The conclusion is that none of the three temperature sensors
improved the performance significantly, and there were no considerable differences
between them, whereas the current did improve the performance.
Acknowledgements
Firstly, we want to thank Gustaf Mårtensson, Daniel Grafström and Juan Albahaca for
realizing this project and giving us the opportunity to complete it. You have also given
us your support and encouragement throughout the project, for which we are grateful.
We also want to express our deepest appreciation to all our coworkers for providing
expert knowledge and for supporting us when needed.
A special thanks to our supervisor, Carl During, for his support and feedback
throughout the project.
Tack!
Thank you!
Contents
1 Introduction
  1.1 Background
  1.2 Problem Formulation
  1.3 Research Question
  1.4 Requirements
  1.5 Delimitations
  1.6 Methodology
  1.7 Thesis Outline
2 Frame-of-Reference
  2.1 Jetting Technology
    2.1.1 Piezoelectric Actuator
    2.1.2 Jet Printing Quality
  2.2 Non-Newtonian Fluids
    2.2.1 Solder Paste
  2.3 Sensors
    2.3.1 Temperature
    2.3.2 Current
  2.4 Data Acquisition
  2.5 Neural Network Architecture
    2.5.1 Recurrent Neural Network
3 Methodology
  3.1 Research Strategy
  3.2 Internal and External Validity
  3.3 Procedure
4 Implementation
  4.1 Hardware Configuration
    4.1.1 Sensors
    4.1.2 Red Pitaya
  4.2 Software Configuration
  4.3 Experimental Procedure
  4.4 Verification and Validation
    4.4.1 Thermocouples
    4.4.2 Shunt resistor
    4.4.3 Prediction Model
5 Results
  5.1 BGA Results
  5.2 RT1 Results
  5.3 Fulfillment of Requirements
6 Discussion
  6.1 Test Cases
    6.1.1 BGA
    6.1.2 RT1
    6.1.3 Neural Network Performance
  6.2 Sensors
  6.3 Requirements
  6.4 Research Method
7 Conclusions
8 Future Work
References
Appendices
A PCB Schematic Overview
B Training and Validation Loss
C Performance Without Current
List of Figures
1.1.1 Ejector of a jet printing machine.
1.1.2 An assembly line solution at Mycronic.
1.2.1 Temperature sensor placement.
1.2.2 Current and voltage waveform.
2.1.1 Simplified illustration of the ejector in a jet printing machine.
2.1.2 Three-phase voltage waveform of the piezo actuator for a single ejected solder paste droplet.
2.1.3 Two different qualities of a BGA jet printing job.
2.1.4 Quality measurement of a single droplet on a substrate.
2.2.1 Rheological properties of different non-Newtonian fluids.
2.2.2 Shear viscosity as a function of shear rate for two solder paste samples.
2.3.1 Simple thermocouple circuit.
2.4.1 Red Pitaya overview.
2.5.1 Typical architecture of a fully connected neural network with one hidden layer.
2.5.2 Structure of an artificial neuron in a neural network.
2.5.3 Circuit diagram of a cell in an RNN.
2.5.4 Illustration of the data flow through an LSTM cell.
2.5.5 LSTM models.
4.1.1 Overview of hardware setup.
4.1.2 Flow chart illustrating the data transfer.
4.1.3 The PCB used for data gathering.
4.4.1 Measured temperature before and after calibration.
4.4.2 Thermocouples response to temperature changes.
4.4.3 Measured temperature before and after filtering.
4.4.4 Positioning of the shunt resistor.
4.4.5 Measured current of a single solder paste shot in a BGA job.
4.4.6 Measured noise from the current sensing.
4.4.7 Calibrated measurement of current.
4.4.8 Verification and validation of the LSTM model.
5.1.1 Training and validation loss when training the LSTM model using a BGA job.
5.1.2 Predicted and true diameter using a BGA job with current data as input.
5.1.3 Predicted and true diameter using a BGA job with three different variable configurations as input.
5.1.4 Predicted and true diameter using a BGA job with all input variables given to the model.
5.1.5 Predicted and true distribution using a BGA job.
5.2.1 Training and validation loss when training the LSTM model using an RT1 job.
5.2.2 Predicted and true diameter using an RT1 job with all input variables given to the model.
5.2.3 Predicted and true distribution using an RT1 job with all input parameters to the LSTM model.
A.0.1 PCB Schematic Overview.
B.0.1 Training and validation loss for different input parameter configurations using a BGA job.
C.0.1 Results from the predictions by the LSTM model when given only the temperature as input when using a BGA job.
C.0.2 Results from the predictions by the LSTM model when given only the temperature as input when using an RT1 job.
List of Tables
4.2.1 Architecture of prediction model.
4.3.1 Variable configuration for the BGA and RT1 job.
4.3.2 Evaluation cases for the LSTM model, where a BGA job with different input configuration is fed to the model.
5.1.1 BGA test results for different sensor combinations.
5.2.1 RT1 test results for selected sensor combination.
C.0.1 Test results when only using temperature data.
List of Abbreviations
A/D Analog-to-Digital
ASIC Application Specific Integrated Circuit
BGA Ball Grid Array
BNC Bayonet Neill–Concelman
D/A Digital-to-Analog
EMI Electromagnetic Interference
FPGA Field-Programmable Gate Array
GPIO General-Purpose Input/Output
GPU Graphics Processing Unit
I2C Inter-Integrated Circuit
IC Integrated Circuit
IDC Insulation-Displacement Contact
LSTM Long Short-Term Memory
MAE Mean Absolute Error
MRE Mean Relative Error
PCB Printed Circuit Board
PnP Pick-and-Place
PRT Platinum Resistance Thermometer
RNN Recurrent Neural Network
RT1 Robustness Test 1
SGD Stochastic Gradient Descent
SMD Surface Mount Device
SMT Surface Mount Technology
SPC Statistical Process Control
SPI Serial Peripheral Interface
UART Universal Asynchronous Receiver/Transmitter
Chapter 1
Introduction
This chapter introduces the project, its purpose and its framework. Moreover, it
introduces the company where the master’s thesis project has been conducted, namely
Mycronic, as well as provides a brief overview of their technology.
1.1 Background
Surface mount technology (SMT) is a technology used for producing printed circuit
boards (PCBs), which started to become widely used in the 1980s. Instead of having
components with leads that go through the PCB, so-called through-hole components,
surface mount devices (SMDs) are used. SMT is used in virtually all commercial
production of circuit boards due to advantages with regard to size, cost, reliability and
automatability.
There are different ways of applying solder paste for SMD components, one of
which is jet printing. This method utilizes a piezoelectric actuator to operate a piston
which ejects solder paste out of a nozzle. The solder paste dots are ejected at high
frequencies and typically have a volume of 5-20 nl. An example of what a printing
head, also called ejector, can look like can be seen in Figure 1.1.1.
Figure 1.1.1: Ejector of a jet printing machine [1].
One company which uses this type of technology is the Swedish manufacturer
Mycronic. Mycronic is a high-tech company that has been producing world-leading
production equipment for display and electronic manufacturing since the early 80s [2].
In addition to Mycronic’s mask writer, they also offer complete assembly line solutions,
which include the jet printing machine and the Pick-and-Place (PnP) machines [3].
This project was focused on their technology within SMT and, more specifically, the
jet printing machine which is the second unit in the assembly line as seen in Figure
1.1.2.
Figure 1.1.2: An assembly line solution at Mycronic [4].
1.2 Problem Formulation
A jet printing machine deposits solder paste or other assembly fluids with high accuracy
and good repeatability, but due to unknown factors the quality of the ejected solder
paste droplets can still vary between shots. Thus, it is of interest to monitor variables
of the jet printing machine using a neural network in order to gain more knowledge
about the cause of the varying quality.
Manufacturers of circuit boards face growing demands for higher density and
complexity in today’s technology. The traditional technique of
detecting defects in PCB production is through statistical process control (SPC), but
it is used off-line which means that defects are detected after the completed process
[5]. In [5], [6] and [7] it is mentioned that real-time detection of process drifts is
preferred in order to make production more efficient and robust. In [5], it is stated
that a neural network has advantages in both accuracy and robustness in the field of
modelling semiconductor processes. It is further explained in [5] that a neural network
can learn to map complex sequences and handle corrupted data.
In order to implement a neural network for predicting the quality of solder paste
deposits, it first has to be decided what variables should be monitored. The choice of
variable depends on the availability of the signal and on how likely it is to predict a
certain behaviour. One possibility is temperature. When the temperature of solder
paste varies, so do its rheological properties [8], [9]. Different components and
mechanisms in the ejector are affected depending on how much the temperature
changes during a printing job. A previous study mentions that an increase in
temperature will decrease the viscosity, which will affect the quality of the printing
negatively [9]. Therefore, a possible hypothesis is that data gathered from temperature
sensors in the ejector can be used as an input to a neural network that could be used to
predict changes in jetting quality. The place where the temperature is measured may
have an influence on the performance of the neural network, and therefore three sensor
positions will be compared in this project. These three temperature sensor positions
are based on sensor positions used in a previous master’s thesis at Mycronic [4].
A stepper motor controls the Archimedes screw which feeds the ejector with solder
paste, as seen in Figure 1.1.1. The efficiency of the stepper motor is highly dependent
on the solder paste properties, such as the viscosity of the fluid. Before each shot,
the screw is turned a certain number of degrees and if the temperature changes, the
volume being fed will also change to some extent. The temperature here is affected
by heat generated in the ejector, friction in the paste screw, as well as heat generated
by the stepper motor. Therefore, one temperature sensor was placed after the paste
screw, which is shown as Sensor 1 in Figure 1.2.1. The solder paste being fed to the
screw comes from a tube which has been stored in a refrigerator, and once mounted in
the machine is slowly heated up to room temperature before jetting starts. Measuring
the temperature of the paste being fed was therefore also of interest, since this could
also affect the pump. This position is named Sensor 2 in Figure 1.2.1.
As for the piezo-controlled piston, the temperature can have multiple effects. The
rheological properties are a function of temperature. Changes of those properties
will affect how easy or hard it is for the piston to push the paste through the ejector
nozzle, but also the number of undesirable air pockets that are created in the chamber.
These air pockets change how much solder paste is ejected and at what speed, thereby
decreasing the quality. A temperature sensor was therefore placed in the chamber next
to the piston, presented as Sensor 3 in Figure 1.2.1. As changes in temperature have
an impact on the jetted quality due to changes in viscosity [9], having a sensor in this
location was also of interest. The properties of the paste here affect how the paste
exits the nozzle, that is, if there is a tendency for satellites, if the positioning is good,
etc. Satellites are small undesirable solder paste droplets that break off from the main
deposits. This is illustrated in more detail in Section 2.1.2.
Figure 1.2.1: Temperature sensor placement [4].
The displacement of the piezoelectric actuator used in the ejector is controlled
by a voltage reference which follows a predetermined curve. Collected data from
the measured voltage curve, such as rise time and amplitude, could be used to
train the neural network to predict the quality of the jetted solder paste. However,
Mycronic considers the supply voltage to the actuator to be too noisy for qualitative
measurements [4]. An alternative way of measuring the displacement of the
piezoelectric actuator is to measure the current required to follow the voltage as seen in
Figure 1.2.2, which was performed in [4]. It is Mycronic’s hypothesis that the measured
current variations express individual droplet characteristics [4]. The data gathering
from the ejector can also be performed in a non-invasive way.
Figure 1.2.2: Theoretical graph of the current and voltage waveform sequence for an individual jetted solder paste droplet [4].
1.3 Research Question
The research question for this master’s thesis is as follows:
In a piezo-based material depositing device, what are the implications
of the predetermined temperature sensor positions, when providing
supporting data from a current sensor, in regard to increasing the
accuracy of predicting jetted solder paste quality by training a neural
network?
1.4 Requirements
The requirements for this project were decided together with the stakeholders and are
listed below.
• A neural network shall be trained to predict changes in the quality of jetting
deposits which later can be used for real-time prediction.
• Temperature and current shall be measured and the data shall be used as input
to the neural network.
• Three different locations on the ejector shall be used for temperature
measurements.
• The quality of the solder paste shall be based on the diameter of the shots and
these shall be measured using a MY700 jet printer.
• Acceptable results from the neural network require the predicted diameter to
vary less than 8% from the actual diameter for individual predictions.
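The last requirement can be expressed as a simple per-prediction check. The sketch below is a minimal illustration, not the evaluation code used in the project: the diameter values are hypothetical, and the function names are introduced here only for the example. It also computes the mean absolute error (MAE) and mean relative error (MRE) listed among the abbreviations.

```python
from statistics import mean


def relative_errors(predicted, actual):
    """Per-deposit relative error |predicted - actual| / actual."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    return [abs(p - a) / a for p, a in zip(predicted, actual)]


def meets_requirement(predicted, actual, limit=0.08):
    """True if every individual prediction deviates by less than `limit`."""
    return all(err < limit for err in relative_errors(predicted, actual))


# Hypothetical deposit diameters in micrometres, for illustration only.
actual_um = [330.0, 335.0, 328.0, 340.0]
predicted_um = [325.0, 338.0, 350.0, 341.0]

errors = relative_errors(predicted_um, actual_um)
mre = mean(errors)                                                # MRE
mae = mean(abs(p - a) for p, a in zip(predicted_um, actual_um))   # MAE
ok = meets_requirement(predicted_um, actual_um)
print(f"MAE = {mae:.2f} um, MRE = {mre:.2%}, requirement met: {ok}")
```

Because the requirement applies to individual predictions, the check uses the largest deviation rather than the mean: a low MRE does not by itself guarantee that every prediction is within 8%.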
1.5 Delimitations
The delimitations of the project are listed below.
• The MY700 shall be used to gather data for the neural network.
• The focus of the project shall be to only evaluate correlations between two
predetermined input parameters and one output parameter.
• The two inputs shall be temperature and current, while the output shall be the
quality of jetting deposits with respect to diameter.
• The project shall only include two different types of jetting jobs on the MY700,
which are a Ball Grid Array (BGA) job and a Robustness Test 1 (RT1) job.
• The study shall be performed on one machine and one ejector only.
• The neural network shall be trained off-line so that, at a later stage, it can be used
for real-time prediction.
1.6 Methodology
The methodology used in this project was a case study. Two different jet printing jobs
were performed with different conditions for the neural network to be trained on. The
argument for using a case study in this project, and a more detailed explanation of the
jet jobs can be found in Chapter 3.
The workload has been shared equally between us in this project, and both of us have
worked on the software and the hardware. The main reasons for contributing equally
in all areas were that both of us should gain knowledge in all fields and that
we could easily share thoughts and ideas during the development of the project. We
have followed the Scrum team management framework in order to assign weekly tasks
and easily follow up on them.
1.7 Thesis Outline
Chapter 1 provides an introduction to the project and its purpose, as well as an
overview of the company where the master’s thesis has been conducted. Furthermore,
it describes the importance of this project in order to create a framework to further
understand the changes in the quality of jetting deposits. Chapter 2 summarizes the
literature study about the ejector technology, the properties of solder paste, the sensors
used and different types of architectures of neural networks. Chapter 3 explains the
chosen methodology for this project, as well as alternatives. Furthermore, it discusses
the internal and external validity and the procedure of this project. Chapter 4 explains
the procedure of implementing the sensors and extracting the data to the neural
network, as well as the design of it. This chapter also explains the verification and
validation process of each component of the project. The results of how well the neural
network could predict the quality of the ejected droplets are presented in Chapter
5. Chapters 6 and 7 discuss the findings and link back to the research question and
requirements. Improvements and future work are presented in Chapter 8.
Chapter 2
Frame-of-Reference
This chapter summarizes the literature study about the ejector technology, the
properties of solder paste, the sensors used, the data acquisition and different types
of architectures of neural networks.
2.1 Jetting Technology
A simplified illustration of an ejector in a jet printing machine is shown in Figure 2.1.1,
where some key components are highlighted. One key component that is missing from
Figure 2.1.1, but shown in Figure 1.1.1, is the Archimedes screw, which feeds
the solder paste into the chamber from the container.
Figure 2.1.1: Simplified illustration of the ejector in a jet printing machine [4].
In the ejector, which is used in the Mycronic MY700, an Archimedes screw feeds
the chamber with solder paste in a controlled way from the container with solder
paste. The piezo expands as voltage is applied to it, causing the piston to move. The
momentum from the piston is transferred into the solder paste in the chamber and the
material is ejected out from the nozzle. The volume of the solder paste in the chamber
is accurately controlled by the Archimedes screw. As the voltage level drops, the piezo
volume is reduced and the spring moves the piston to its initial position [4]. As the
piston moves back up, the process is repeated. The jet printers can eject solder paste
droplets at a frequency of up to 300 Hz, and the volume of the droplets is measured in
nanoliters [10], [11].
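To give a sense of scale, the figures above can be combined in a short calculation. The sketch below assumes that the maximum frequency of 300 Hz is sustained and that each droplet has a volume in the typical 5–20 nl range quoted in Section 1.1. Combining the two in this way is an assumption made for illustration, not a machine specification.

```python
def droplet_period_ms(frequency_hz):
    """Time between consecutive droplets, in milliseconds."""
    if frequency_hz <= 0:
        raise ValueError("frequency must be positive")
    return 1000.0 / frequency_hz


def paste_flow_ul_per_s(frequency_hz, droplet_volume_nl):
    """Average paste flow in microlitres per second (1 ul = 1000 nl)."""
    return frequency_hz * droplet_volume_nl / 1000.0


max_frequency_hz = 300.0          # upper jetting frequency quoted above
typical_volumes_nl = (5.0, 20.0)  # typical droplet volumes from Section 1.1

period_ms = droplet_period_ms(max_frequency_hz)
flow_ul_per_s = [paste_flow_ul_per_s(max_frequency_hz, v) for v in typical_volumes_nl]
print(f"period: {period_ms:.2f} ms, flow: {flow_ul_per_s[0]:.1f}-{flow_ul_per_s[1]:.1f} ul/s")
```

Under these assumptions, each shot has a time budget of a few milliseconds, which bounds how much data can be sampled per droplet at a given sampling rate.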
2.1.1 Piezoelectric Actuator
In a piezoelectric actuator there are piezoelectric crystals, forming a ceramic, which
expand as voltage is applied and vice versa. Thus, electrical energy is converted to
mechanical displacement. If an alternating voltage is applied to the material, it changes
its dimensions cyclically at the frequency of the applied voltage. The frequency at which
the piezo most efficiently converts the electrical energy to mechanical displacement
is at its resonant frequency, which is where the impedance is the lowest [12]. The
resonant frequency is determined by the composition of the piezoelectric crystals, as
well as the shape and volume. The main advantages of a piezoelectric actuator are that
it has high precision [12], [13], high force, fast response time [14] and fast acceleration
[15]. A drawback is that it can be affected by hysteresis, that is, the history of
the electric field, stress and displacement can cause nonlinearity [16].
There are two different types of piezoelectric actuators: stack and stripe [12]. The
stack piezo uses multiple stacked layers of piezo elements, whose individual
displacements add up to the total displacement of the actuator [15], as shown
in Equation 2.1. Furthermore, the displacement of the stacked piezo is about 0.1–
0.15% of its total length. However, if the path of displacement is blocked, a force is
applied to the blocking object. The movement of a stacked piezo actuator is defined by
$$\Delta L = n \cdot d_{33} \cdot V, \tag{2.1}$$
where $n$ is the number of stacked piezo elements, $d_{33}$ is the piezoelectric coefficient
and V is the voltage applied. Stacked piezo actuators can be divided into two
categories: low voltage and high voltage. Low-voltage actuators are rated for
an operating voltage of up to 200 V and high-voltage actuators for an operating voltage
of up to 1000 V. A stacked piezo actuator is categorized as either high or low voltage
depending on the thickness of the piezo element. The thicker the piezo element is,
the higher voltage it can operate at [15]. The stacked type of piezo actuator is used in
the ejectors in Mycronic’s jet printing machines.
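Equation 2.1 can be evaluated directly, as in the minimal sketch below. The number of layers, the piezoelectric coefficient and the drive voltage are hypothetical round numbers chosen for illustration and do not describe the actuator in the MY700. The second part uses the rule of thumb stated above, that the displacement is about 0.1–0.15% of the stack length, to estimate the stack length that such a displacement would imply.

```python
def stack_displacement_m(n_layers, d33_m_per_v, voltage_v):
    """Free displacement of a stacked piezo actuator (Equation 2.1), in metres."""
    return n_layers * d33_m_per_v * voltage_v


# Hypothetical example values, not taken from the thesis or a datasheet.
n_layers = 200      # number of stacked piezo elements
d33 = 500e-12       # piezoelectric coefficient in m/V
voltage = 100.0     # applied voltage in V (within the low-voltage range)

delta_l = stack_displacement_m(n_layers, d33, voltage)

# Stack length implied by a displacement of 0.1-0.15 % of the total length.
longest_stack_m = delta_l / 0.0010
shortest_stack_m = delta_l / 0.0015

print(f"displacement: {delta_l * 1e6:.1f} um")
print(f"implied stack length: {shortest_stack_m * 1e3:.1f}-{longest_stack_m * 1e3:.1f} mm")
```

With these example values the displacement is on the order of 10 µm. Note that Equation 2.1 describes the unloaded displacement; as stated above, a blocked actuator produces force instead of movement.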
A striped piezo actuator is configured with two stripes of piezo elements in an
orientation such that when voltage is applied, one of them contracts and the other one
expands [17]. This causes the striped piezo actuator to flex. However, this type is not
used in the ejectors in Mycronic’s jet printing machines.
In Mycronic’s jet printing machines, the stacked piezo actuator is controlled by a
multi-phase-waveform voltage level. An example of a simplified three-phase voltage-
time waveform is shown in Figure 2.1.2.
Figure 2.1.2: Three-phase voltage waveform of the piezo actuator for a single ejected solder paste droplet.
2.1.2 Jet Printing Quality
At Mycronic, there are different ways of analyzing the jet printing jobs of the MY700
machine. One option of test jetting that is frequently used is to perform a BGA test,
which entails producing deposits for generic BGA components. This test deposits a
pattern of squares where the dots can have different sizes and distances between them
[4], [18]. Figure 2.1.3a illustrates an approved jet printing job, while Figure 2.1.3b
shows a faulty job. A faulty job can be confirmed if the deposits of the job have an erratic
positioning or size, contain bridges of solder paste between the deposits or contain
satellites. Another type of test is called RT1. This test shoots 12-dot strips with varying
diameters and frequencies. The finished test boards are analyzed by the machine
from which quality measurements such as diameter, satellites, area, positioning and
shape can be extracted. Another solder paste inspection machine can extract additional
quality measurements for each deposit, such as volume. As seen in Figure 2.1.3a, an
approved job has few satellites, consistent pattern and accurate shape. In Figure 2.1.4,
an image of an individual ejected droplet is shown along with the presence of a satellite.
Figure 2.1.3: Two different qualities of a BGA jet printing job [4]. (a) Accepted job. (b) Faulty job.
Figure 2.1.4: Quality measurement of a single droplet on a substrate. Positioning error, area, shape and satellites are illustrated [4].
2.2 Non-Newtonian Fluids
Fluids can be divided into two categories: Newtonian and non-Newtonian. A
Newtonian fluid follows Newton’s law of viscosity, that is, that the viscosity of the
fluid is independent of the shear rate [19]. Generally, the viscosity of a Newtonian
fluid is constant at a given temperature and pressure, and examples of such are air
and water. Not all fluids follow Newton’s law of viscosity, and these fluids are referred
to as non-Newtonian. These fluids display a more complex behaviour as they do not
have a constant viscosity at a given temperature and pressure. Instead, the viscosity
is dependent on the flow conditions, such as shear rate, flow geometry and even
kinematic history in certain cases [20].
The study of the deformation and flow of materials is called rheology. There are different
types of non-Newtonian fluids which have different rheological properties, and they
can be divided into four categories: pseudoplastic, dilatant, thixotropic and rheopectic
fluids [21]. The behaviour of these can be seen in Figure 2.2.1. In a Newtonian fluid
the viscosity, defined as shear stress divided by shear rate, is constant and is therefore
represented by a linear relationship in the graph.
Figure 2.2.1: Rheological properties of different non-Newtonian fluids [20].
Pseudoplastic fluids are shear thinning, meaning that the viscosity decreases as the
stress increases. An example of a shear thinning fluid is ketchup. A similar variant
is yield-pseudoplastics, which behave like pseudoplastics, but only after a certain yield
stress. Dilatant fluids are shear thickening and behave in the opposite way to
pseudoplastics, meaning that the viscosity increases as the stress increases. Cornstarch
mixed with water, also known as oobleck, is an example of this. Both thixotropic and
rheopectic fluids are time-dependent. The viscosity of thixotropic fluids decreases with
stress over time, whereas that of rheopectic fluids increases [21]. Examples of thixotropic
and rheopectic fluids are solder paste and printer ink, respectively.
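The time-independent categories above can be illustrated with a minimal sketch that computes the apparent viscosity (shear stress divided by shear rate) at several shear rates and checks whether it stays constant, falls or rises. The function names and the sample values are hypothetical and chosen only for illustration; time-dependent effects (thixotropy, rheopexy) are not captured by this check.

```python
def apparent_viscosity(shear_stress, shear_rate):
    """Apparent viscosity [Pa*s]: shear stress [Pa] divided by shear rate [1/s]."""
    if shear_rate <= 0:
        raise ValueError("shear rate must be positive")
    return shear_stress / shear_rate


def classify(samples, tolerance=1e-9):
    """Classify a fluid from (shear_rate, shear_stress) pairs sorted by shear rate.

    Only the first and last apparent viscosities are compared, which is a
    simplification that assumes a monotonic trend between them.
    """
    viscosities = [apparent_viscosity(stress, rate) for rate, stress in samples]
    first, last = viscosities[0], viscosities[-1]
    if abs(last - first) <= tolerance * max(first, last):
        return "Newtonian"
    return "shear thinning" if last < first else "shear thickening"


# Hypothetical (shear rate [1/s], shear stress [Pa]) pairs, not measured data.
newtonian = [(1.0, 2.0), (10.0, 20.0), (100.0, 200.0)]
thinning = [(1.0, 50.0), (10.0, 120.0), (100.0, 300.0)]
thickening = [(1.0, 2.0), (10.0, 40.0), (100.0, 900.0)]

for name, data in [("newtonian", newtonian), ("thinning", thinning), ("thickening", thickening)]:
    print(name, "->", classify(data))
```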
2.2.1 Solder Paste
Solder paste is a fluid which is composed of a mixture of metal solder powder, a binder,
flux and other rheological components. The solder particles typically have a diameter
between 10 and 30 µm in jet printing applications and are produced to be as spherical as
possible [22]. Different alloy types can be used for the solder powder depending on the
application. The binder is used to keep the paste from separating, and the flux removes
the oxide layer between the metal and solder as well as accelerates the wetting of the
metal [23]. The composition of solder pastes affects their rheological properties, and
the exact composition is generally not disclosed by the companies that produce them.
In order to provide its customers with solder pastes that fit their applications,
Mycronic cooperates with other companies that produce solder pastes [4].
The composition of solder paste gives it a non-Newtonian behaviour. When it is
exposed to a shear stress, it exhibits a thixotropic behaviour, or in other words, the
viscosity decreases over time. Using shear sweeps, Mycronic has tested two solder
paste samples for shear viscosity as a function of shear rate which can be seen in
Figure 2.2.2. The viscosity decreases as the shear rate increases, which is consistent with the
thixotropic behaviour. One reason for this behaviour is that when no shear stress is
applied, attractive forces between the metal particles create flocs of particles, which
increase the viscosity. As shear is applied, these flocs break apart, which decreases
the viscosity of the paste. Once the shear is removed, flocs begin forming again and
the viscosity increases. However, the structure of the flocs might change, which would
mean that the viscosity does not fully return to the same state as before [24].
Figure 2.2.2: Shear viscosity as a function of shear rate for two solder paste samples [4].
2.3 Sensors
This section describes the sensors that were considered for data gathering in the
project.
2.3.1 Temperature
A common choice for temperature measurements is a thermocouple. Thermocouples
work by having a closed circuit of two dissimilar metals, as can be seen in Figure 2.3.1.
If there is a difference in temperature between the two junctions of the thermocouple,
a voltage will be produced between the two metals due to the thermoelectric effect,
which can be measured at one of the junctions [25]. This voltage can then be used to
determine the temperature at the opposite junction. The combination of metals used
in the sensor affects the voltage produced, and this can vary between sensors. The
main advantages of thermocouples are that they are robust, relatively inexpensive, can
measure a wide range of temperatures and are self-energized. The disadvantages of
these sensors are that the signal is weak, which makes them sensitive to electrical noise,
and that the output is non-linear and requires amplification. Two other types of
temperature sensors are platinum resistance thermometers (PRTs) and thermistors.
The basic principle for both of these sensors is that their resistance is dependent on
temperature. However, PRTs are more expensive than thermocouples, and thermistors
cannot measure as wide a temperature range [26].
Figure 2.3.1: Simple thermocouple circuit [25].
2.3.2 Current
There are several principles for measuring current, but the most common method
is using a shunt resistor [27]. A shunt resistor is a low resistance resistor used for
determining the current through the resistor by measuring the voltage drop over it.
Ohm’s Law states that
V = I ·R, (2.2)
where V is the voltage drop over the resistor, I is the current through it and R is
the resistance. This means that the voltage changes proportionally with the current.
The advantages of shunt resistors are that they are inexpensive, robust and have high
accuracy. A drawback is that the power loss in the resistor
is proportional to the square of the current, which means that they are generally
not suitable for measuring high currents. Furthermore, the resistance can vary due
to factors such as aging or changes in temperature, which affects the precision of the
measurement [28]. Other methods of measuring current include using Hall effect
sensors to measure changes in the magnetic field created by the current, as well as
using sensors based on Faraday’s Law where transformers are utilized.
2.4 Data Acquisition
A powerful measurement tool that can make multiple measurements
simultaneously, with performance comparable to that of laboratory equipment, is the Red
Pitaya. The Red Pitaya is a single-board computer which is intended to be an alternative
to more expensive laboratory equipment. It is an open-source instrumentation
platform that can be used for a variety of measurement and test tasks [29]. The Red Pitaya has a
built-in signal generator, and pre-developed apps can be downloaded from the web page or
one can develop one's own apps [30]. Depending on the version, there are two 14-bit
or 10-bit analog-to-digital (A/D) and digital-to-analog (D/A) converters on the board
that operate at a sampling rate of 125 MS/s [29]. These fast input channels
have a bandwidth of 50 MHz. The Red Pitaya also has two extension connectors,
which provide access to four slow analog inputs, four slow analog outputs, 16 General-Purpose
Input/Output (GPIO) pins, Inter-Integrated Circuit (I2C), Universal Asynchronous
Receiver/Transmitter (UART) and Serial Peripheral Interface (SPI) [29], [31]. The
slow input channels have a bandwidth of 50 kHz. The Red Pitaya is thus a useful
measurement tool when high-performance signal processing of high-frequency
signals of up to 50 MHz is required [29]. Figure 2.4.1 shows the hardware overview of
the Red Pitaya where some components are highlighted.
Figure 2.4.1: Hardware overview of the Red Pitaya [31].
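To give a feeling for what the converter resolution and sampling rate mean in practice, the following sketch computes the quantization step of an ideal A/D converter and the number of samples captured in a time window. The 2 V full-scale range is an assumption made only for this example, and the function names are hypothetical.

```python
def adc_step(full_scale_range_v, bits):
    """Quantization step (one LSB) of an ideal ADC, in volts."""
    return full_scale_range_v / (2 ** bits)


def samples_in_window(sampling_rate_hz, window_s):
    """Number of samples acquired during a time window."""
    return int(round(sampling_rate_hz * window_s))


FULL_SCALE_V = 2.0       # assumed input range (for example -1 V to +1 V)
SAMPLING_RATE_HZ = 125e6  # sampling rate of the fast inputs

for bits in (10, 14):
    step_uv = adc_step(FULL_SCALE_V, bits) * 1e6
    print(f"{bits}-bit converter: step of {step_uv:.1f} uV")

# A 1 ms capture at the full sampling rate already yields 125 000 samples.
print(samples_in_window(SAMPLING_RATE_HZ, 1e-3), "samples in 1 ms")
```

The 14-bit version resolves steps that are 16 times smaller than the 10-bit version over the same input range.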
The Red Pitaya features the Xilinx Zynq 7010, which is pointed out in Figure 2.4.1.
This system combines a Field-Programmable Gate Array (FPGA) and a multi-core
processor. The advantage of FPGAs is that they can be reprogrammed for a desired
task after they have been manufactured [32]. In other words, an FPGA that is working as
a microprocessor can, for example, be reprogrammed to work as a graphics card. The
more common technology is the Application-Specific Integrated Circuit (ASIC), where a
component is designed for only one purpose throughout its lifetime [32]. One example
of that technology is the graphics processing unit (GPU) inside a modern phone, where
the logic cannot be reprogrammed to work as another component.
2.5 Neural Network Architecture
The main objective of a neural network is to recognize patterns. This is made possible
by first having the neural network learn from a set of examples that pair inputs with
their corresponding outputs. The neural network can then apply what it has learnt to new,
unseen input data to predict a relevant output [33]. A typical neural network is
seen in Figure 2.5.1. A neural network consists of an input layer, one or
more hidden layers, an output layer and interconnections between nodes of different
layers.
Figure 2.5.1: Typical architecture of a fully connected neural network with one hidden layer.
The training process of a neural network can be divided into two phases:
forward-propagation and back-propagation. During forward-propagation the
information is sent through the neural network and a prediction is made. The
input layer first receives
information from an external source. That information is passed, via the connections,
to the nodes of the hidden layer, which process it. Lastly, the output
layer receives the processed data, which is given to the user. The path of the information
from the input layer, through the hidden layer, to the output layer is determined by the
strength of the interconnections between nodes. Each node has a set of weights which
determines the importance of its inputs, as well as a bias which adjusts the output.
When a node in the input layer receives information, it is activated. The activation
function then causes a signal to be emitted to its neighbouring nodes. This signal
is either excited or inhibited depending on the strength of the interconnection, that is,
the magnitude of the weights and biases. This process continues on through the neural
network, which creates a pattern of activation that manifests itself in the output layer
[33]. The forward-propagation in a neural network is defined, mathematically, by
$$a^{(l)} = g\left(a^{(l-1)}; \Theta\right), \tag{2.3}$$
where g is the activation function, $a^{(l)}$ is the activation of layer l and Θ
represents the parameters, or in other words, the weights and biases. The preactivation,
to which g is applied, is a weighted sum of the inputs to the layer. Figure 2.5.2 shows the
principle of operation of an artificial neuron. First, the sum of the inputs, weighted by
the parameters $\Theta_n$, is calculated and then passed through an activation function, g.
Figure 2.5.2: Structure of an artificial neuron in a neural network.
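The operation of a single artificial neuron, and of a layer built from such neurons, can be sketched as follows. This is a minimal illustration with hypothetical function names and made-up weights, not the implementation used in the project.

```python
import math


def sigmoid(z):
    """Sigmoid activation function."""
    return 1.0 / (1.0 + math.exp(-z))


def neuron(inputs, weights, bias, activation):
    """Weighted sum of the inputs plus a bias, passed through an activation."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    preactivation = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(preactivation)


def dense_layer(inputs, weight_rows, biases, activation):
    """A fully connected layer: every neuron sees every input."""
    return [neuron(inputs, w, b, activation) for w, b in zip(weight_rows, biases)]


# Made-up example: two inputs feeding a hidden layer of two neurons.
x = [1.0, 2.0]
hidden = dense_layer(x, [[0.5, -0.25], [0.1, 0.3]], [0.0, -0.2], sigmoid)
print(hidden)
```

Stacking several calls to `dense_layer`, each taking the previous output as its input, corresponds to applying Equation 2.3 layer by layer.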
Examples of activation functions are sigmoid, tanh and linear, which are defined
by Equation 2.4, Equation 2.5 and Equation 2.6 respectively.
$$g(z) = \frac{1}{1 + e^{-z}} \tag{2.4}$$
$$g(z) = \tanh(z) \tag{2.5}$$
$$g(z) = z \tag{2.6}$$
The first two activation functions are non-linear functions, whose purpose is to
introduce non-linearity into the neural network. Equation 2.6, on the other hand, is
a linear activation function. If only linear activation functions are used in the hidden
layers of a neural network, the output will just be a linear transformation of the input.
In other words, a composition of successive linear transformations is equivalent to
one linear transformation, which means that complex non-linear problems cannot be
accurately mapped between input and output. Moreover, most real world problems
are highly complex and non-linear, which is why non-linear activation functions are
required in, at least, the hidden layers of a neural network. However, a linear activation
function can be used in the output layer if a continuous value shall be predicted.
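The argument that stacked linear layers collapse into a single linear transformation can be checked numerically. The sketch below uses a scalar two-layer network with made-up weights: with a linear hidden activation the output coincides with one equivalent linear map, whereas with tanh it does not lie on a straight line.

```python
import math


def linear(z):
    """Linear activation function, g(z) = z."""
    return z


def two_layer(x, w1, b1, w2, b2, hidden_activation):
    """Scalar network: hidden layer with an activation, then a linear output."""
    hidden = hidden_activation(w1 * x + b1)
    return w2 * hidden + b2


# Made-up parameters for illustration.
w1, b1, w2, b2 = 2.0, 1.0, -3.0, 0.5

# With a linear hidden layer: w2*(w1*x + b1) + b2 = (w2*w1)*x + (w2*b1 + b2).
w_eq = w2 * w1
b_eq = w2 * b1 + b2

inputs = [-2.0, -1.0, 0.0, 1.0, 2.0]
linear_outputs = [two_layer(x, w1, b1, w2, b2, linear) for x in inputs]
collapsed_outputs = [w_eq * x + b_eq for x in inputs]
tanh_outputs = [two_layer(x, w1, b1, w2, b2, math.tanh) for x in inputs]

print(linear_outputs)
print(collapsed_outputs)
print(tanh_outputs)
```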
When this process is finished and the error of the prediction, that is, the loss, has
been calculated, the model has completed its forward-propagation. However, in order to
learn, that is, update its weights, back-propagation is needed. The purpose of
back-propagation is to minimize the contribution of each node to the total
error [33]. This is made possible by a technique named gradient descent, which tunes
the weights in order to minimize the loss function, which evaluates how well the model is
performing. Minimizing the loss function is, thus, an optimization problem in terms of
tuning the weights of the neural network. Since the loss function is a summation of the
prediction errors of the neural network, the lower the loss, the better the performance
of the neural network. Examples of methods to calculate the loss are mean absolute
error (MAE) and mean relative error (MRE). These two methods are defined as
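$$\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} |y_i - \hat{y}_i|, \tag{2.7}$$
$$\mathrm{MRE} = \frac{1}{n} \sum_{i=1}^{n} \frac{|y_i - \hat{y}_i|}{y_i}, \tag{2.8}$$
respectively, where n is the total number of data points, $y_i$ is the true value and $\hat{y}_i$ is the
predicted value.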
At each update, the learning rate determines how much the model is modified with
respect to the estimated error. An excessive learning rate can cause an unstable
training process, whereas a rate that is too low will require a longer training process.
Thus, the main idea behind training neural networks is to minimize the loss function
by modifying the parameters of the model, in turn maximizing the accuracy [34]. Once
forward- and back-propagation have been completed for the entire training set, the neural network
has completed one epoch of training.
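The effect of the learning rate can be illustrated with plain gradient descent on a toy one-parameter loss. The quadratic loss and the specific learning rates are assumptions chosen for this sketch and are unrelated to the loss or settings used in the project.

```python
def gradient_descent(learning_rate, steps, w_start=0.0, w_target=3.0):
    """Minimize the toy loss L(w) = (w - w_target)**2 with gradient descent."""
    w = w_start
    for _ in range(steps):
        gradient = 2.0 * (w - w_target)  # dL/dw
        w -= learning_rate * gradient    # step against the gradient
    return w


STEPS = 50
w_good = gradient_descent(0.1, STEPS)    # converges close to the target
w_slow = gradient_descent(0.001, STEPS)  # too low: barely moves in 50 steps
w_bad = gradient_descent(1.1, STEPS)     # too high: overshoots and diverges

for label, w in [("0.1", w_good), ("0.001", w_slow), ("1.1", w_bad)]:
    print(f"learning rate {label}: w = {w:.4g}, distance to optimum = {abs(w - 3.0):.4g}")
```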
While training neural networks, the model is likely to overfit if there is no
regularization. When overfitting, the model performs well on the training data but
poorly on new, unseen data. This can be seen as a decreasing training loss
combined with a constant or increasing validation loss during training. This means that the neural
network has memorized the training data rather than learnt to generalize to new data.
To minimize the risk of overfitting, different regularization techniques can be used,
such as dropout or L2 regularization. Dropout randomly removes
connections in the neural network during training. Thus, the neural network cannot
rely on any particular connection between nodes, which helps prevent overfitting. The L2
regularization method dynamically penalizes the weights, such that large weights are
penalized more than small ones. As with dropout, L2 regularization also decorrelates
the neural network.
2.5.1 Recurrent Neural Network
Recurrent neural networks (RNNs) are neural networks that are specialized in
processing sequences of data, which can have variable lengths [35]. The main difference
between the structure of an RNN and other neural networks is that the nodes of an RNN
have a recurrent connection, which stores previous calculations and, thus, functions
as a memory. This results in the RNN having two inputs, the present and the recent
past. The additional input about the past holds valuable information about the future
[36]. Figure 2.5.3 shows a cell in an RNN, which has the recurrent connection that
distinguishes it from other neural networks, such as the one seen in Figure 2.5.1.
Figure 2.5.3: Circuit diagram of a cell in an RNN. Here, $x_t$ is the input at time t, $h_t$ is the state of the hidden layer at time t and $o_t$ is the output at time t. Parameters for the input, hidden layer state and output are $\Theta_i$, $\Theta_h$ and $\Theta_o$, respectively [37].
The graphical model of an RNN cell in Figure 2.5.3 can be explained with the
following equations:
$$o_t = f(h_t; \Theta), \tag{2.9}$$
$$h_t = g(h_{t-1}, x_t; \Theta). \tag{2.10}$$
In Equation 2.9 and Equation 2.10, $o_t$ is the output of the RNN at time t, f and g
are activation functions, $h_t$ is the state of the hidden layer at time t, $x_t$ is the input
at time t and Θ represents the weights and biases. Equation 2.9 shows that the output
depends on the weights and biases and on the state of the hidden layer at time t.
Equation 2.10, in turn, shows that the state of the hidden layer at time t depends
on the weights and biases, the input at time t and the state of the hidden layer at time t−1.
The latter equation is what differentiates RNNs from other neural networks, since the
previous state of the hidden layer, $h_{t-1}$, influences the current state of the hidden
layer, $h_t$ [37]. This demonstrates that RNNs have a memory.
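Equation 2.9 and Equation 2.10 leave the exact form of f and g open. The sketch below assumes a scalar cell in which g is tanh applied to a weighted sum and f is linear; these choices, the parameter names and the values are assumptions made for illustration only.

```python
import math


def rnn_step(h_prev, x_t, params):
    """One time step: new hidden state from the previous state and the input."""
    h_t = math.tanh(params["w_h"] * h_prev + params["w_x"] * x_t + params["b_h"])
    o_t = params["w_o"] * h_t + params["b_o"]
    return h_t, o_t


def run_rnn(sequence, params, h0=0.0):
    """Process a sequence of any length and return the output at every step."""
    h, outputs = h0, []
    for x_t in sequence:
        h, o_t = rnn_step(h, x_t, params)
        outputs.append(o_t)
    return outputs


params = {"w_h": 0.8, "w_x": 1.0, "b_h": 0.0, "w_o": 1.0, "b_o": 0.0}

# Both sequences end with the same input, but their histories differ.
out_a = run_rnn([1.0, 0.5], params)
out_b = run_rnn([-1.0, 0.5], params)
print(out_a[-1], out_b[-1])
```

The final outputs differ even though the final inputs are identical, which is the memory effect described above.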
However, there are two major drawbacks of the RNN architecture: vanishing and
exploding gradients. Both of these issues occur during the back-propagation
phase when there are long-term dependencies, that is, when the network has to memorize
a long sequence. The vanishing or exploding gradients are caused by the repeated
multiplication of partial derivatives in the chain rule during back-propagation through
time [37]. Gradients that are less than one shrink exponentially under repeated matrix
multiplication until they vanish. Conversely, gradients that are greater than one grow
exponentially and eventually cause a
numerical overflow. A solution to this issue is to choose an alternative recurrent neural
network, namely long short-term memory (LSTM).
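The exponential shrinking or growth can be made concrete with a scalar stand-in for the repeated multiplication in the chain rule. The per-step factors 0.9 and 1.1 and the sequence length are arbitrary values chosen for this sketch.

```python
def backpropagated_factor(per_step_factor, steps):
    """Product of the same per-step derivative over a number of time steps."""
    product = 1.0
    for _ in range(steps):
        product *= per_step_factor
    return product


STEPS = 100
vanishing = backpropagated_factor(0.9, STEPS)  # shrinks towards zero
stable = backpropagated_factor(1.0, STEPS)     # stays constant
exploding = backpropagated_factor(1.1, STEPS)  # grows without bound

print(f"0.9 per step: {vanishing:.3e}")
print(f"1.0 per step: {stable:.3e}")
print(f"1.1 per step: {exploding:.3e}")
```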
Long Short-Term Memory
The LSTM architecture is a gated version of the RNN architecture, which addresses the
issue of long-term dependencies [35], [38]. This implies a more complicated structure
of the cell than in RNNs. As seen in Figure 2.5.4, the cell consists of three different
gates: a forget gate, an update gate and an output gate. The gates consist of either a
sigmoid or a tanh function (see Equation 2.4 and Equation 2.5 respectively) in order to
control the flow of information through the LSTM cell. These two types of activation
functions in the LSTM cell also introduce non-linearity to the neural network.
Figure 2.5.4: Illustration of the data flow through an LSTM cell. The three different gates are highlighted: forget gate, update gate and output gate [39].
The inputs of the LSTM cell are the current input, xt, the previous hidden state,
ht−1, and the previous memory state, ct−1. The outputs are the current memory
state, ct, and the current hidden state, ht. The core concept of an LSTM cell is that
information can be passed forward on the cell state memory line, shown as the top
horizontal line in Figure 2.5.4, and information can either be removed or added by
the forget and update gates, respectively. This enables relevant information to be
transferred, modified or unmodified, throughout the processing of the sequence, which
addresses the problem of long-term dependencies in RNNs [35]. Figure 2.5.4 shows
one of many cells that can be connected in series, which can be represented compactly by a
recurrent connection as in Figure 2.5.3. Thus, the illustration in Figure 2.5.3 can be
unrolled by removing the recurrent connection and adding as many cells in series as the
length of the input sequence. The cell outputs in Figure 2.5.4 can be expressed mathematically
by:
$$c_t = f_t \odot c_{t-1} + i_t \odot g_t, \tag{2.11}$$
$$h_t = o_t \odot \sigma_c(c_t), \tag{2.12}$$
where the forget gate, $f_t$, the update gate, $i_t$ and $g_t$, and the output gate, $o_t$, are defined
as
$$f_t = \sigma_g(h_{t-1}, x_t; \Theta_f), \tag{2.13}$$
$$i_t = \sigma_g(h_{t-1}, x_t; \Theta_i), \tag{2.14}$$
$$g_t = \sigma_c(h_{t-1}, x_t; \Theta_g), \tag{2.15}$$
and
$$o_t = \sigma_g(h_{t-1}, x_t; \Theta_o), \tag{2.16}$$
respectively, where $\sigma_g$ is the gate activation function and $\sigma_c$ is the state activation
function.
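Equations 2.11 to 2.16 can be sketched for a scalar LSTM cell, where the element-wise product ⊙ reduces to ordinary multiplication. The sketch assumes that each gate applies its activation (sigmoid for the gate activation, tanh for the state activation) to a weighted sum of the previous hidden state and the current input; the parameter layout and values are hypothetical.

```python
import math


def sigmoid(z):
    """Gate activation function."""
    return 1.0 / (1.0 + math.exp(-z))


def gate(activation, h_prev, x_t, theta):
    """Activation of a weighted sum; theta = (w_h, w_x, bias) is an assumed layout."""
    w_h, w_x, bias = theta
    return activation(w_h * h_prev + w_x * x_t + bias)


def lstm_step(c_prev, h_prev, x_t, params):
    """One LSTM time step for scalar states."""
    f_t = gate(sigmoid, h_prev, x_t, params["f"])    # forget gate (2.13)
    i_t = gate(sigmoid, h_prev, x_t, params["i"])    # update gate (2.14)
    g_t = gate(math.tanh, h_prev, x_t, params["g"])  # candidate state (2.15)
    o_t = gate(sigmoid, h_prev, x_t, params["o"])    # output gate (2.16)
    c_t = f_t * c_prev + i_t * g_t                   # memory state (2.11)
    h_t = o_t * math.tanh(c_t)                       # hidden state (2.12)
    return c_t, h_t


# Made-up parameters: forget gate saturated open, update gate saturated closed,
# so the memory state is carried forward essentially unchanged.
keep_memory = {"f": (0.0, 0.0, 50.0), "i": (0.0, 0.0, -50.0),
               "g": (0.5, 1.0, 0.0), "o": (0.0, 0.0, 0.0)}
c_t, h_t = lstm_step(c_prev=2.0, h_prev=0.3, x_t=1.0, params=keep_memory)
print(c_t, h_t)
```

With the gates set this way the memory state passes through untouched, which is the mechanism that lets an LSTM retain information over long sequences.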
Different types of LSTM models include vanilla, stacked, encoder-decoder and
bidirectional. Vanilla LSTMs are often referred to as the default or standard version
of the architecture and consist of an input layer, one fully connected hidden LSTM
layer and a fully connected output layer. This is the simplest version of an LSTM and
is generally a good starting point for solving a problem. Models with more than one
hidden LSTM layer are referred to as stacked LSTMs. The advantage of having more
than one layer is that it can improve the performance of a neural network. Additionally, having
several small layers is generally more efficient than having one large layer [40]. The
layouts of these two models can be seen in Figure 2.5.5a and Figure 2.5.5b.
The encoder-decoder model is useful for sequence-to-sequence problems, that is,
when the input is a sequence of values and the goal is to predict the coming sequence
of values. This architecture contains an encoder model which processes the input and
encodes it into a vector with fixed length. This vector is then given to the decoder
model, which decodes the vector and gives the predicted sequence. The main use of the
architecture is natural language processing and text translation, and its layout can be seen
in Figure 2.5.5c. The encoder-decoder model has been found to occasionally be more
efficient when the input is reversed, and this phenomenon is utilized in the bidirectional
model. In this model the input is fed to two layers which are side-by-side, as can be seen
in Figure 2.5.5d. The forward input sequence is given to the first layer and a reversed
version of the input sequence is given to the other layer. This method has been known
to increase the performance of a neural network, but it does require that the entire
input is available [40].
Figure 2.5.5: LSTM models [40]. (a) Vanilla LSTM. (b) Stacked LSTM. (c) Encoder-decoder LSTM. (d) Bidirectional LSTM.
Chapter 3
Methodology
The following chapter presents the methodology used in this project. The chosen
research strategy, the internal and external validity, as well as an overview of the
procedure used are presented and discussed.
3.1 Research Strategy
It is important to choose a methodology in research since it explains what type of
systematic approach is being used to solve the problem. In other words, it is a work
plan that addresses the research problem by defining the procedure of methods by which
knowledge is obtained [41]. The methodology also defines the quality assurance of the
project, that is, the validation and verification of the research material [42].
The methodology used in this project was an empirical quantitative research
approach, utilizing a case study as the research strategy. In [43], the purpose of
quantitative research is defined as studying relationships, cause and effect. It is also
mentioned in [42], [43] and [44] that quantitative research is characterized by large
data sets. Considering that the purpose of this project was to gather large amounts
of data and to create a neural network to examine the effect certain parameters have
on the quality of jetting deposits, a quantitative research approach was deemed to be
suitable.
A case study was chosen due to its usefulness when doing an empirical study of a
particular phenomenon using multiple sources of evidence [42]. The phenomenon to
be studied in this project was how the measured temperature and piezo current affected
the accuracy in predicting the quality of jetting deposits. In [45], it is stated that a case
study is beneficial if knowledge is to be obtained regarding a new phenomenon. This
project is the first at Mycronic that investigates the potential benefits of applying a
neural network in their jet printing machine to predict the quality of the deposits. The
use of a case study is also supported by [46], in which it is stated that a case study will
give indications on hypothesis creation. Since the intended purpose of this project was
to create a hypothesis regarding how the defined sensors improved the ability to predict
quality in jet printing machines, a case study was appropriate for achieving this. Using
the knowledge gained in this project, the created hypothesis can be examined further
in future projects, which is discussed in Chapter 8.
An alternative research method is using experiments. Experimental methods deal
with the relationships and effects between variables as they are manipulated [42]. Since
it is not within the scope of this project to manipulate variables, such as temperature
or current, an experimental method was not chosen. It is also mentioned in [47]
that an advantage of using case studies over experiments is that only naturally
occurring cases are investigated, rather than cases created by the researcher. Thus,
the realism of the project can be assured by using a case study. Additionally, the causal
hypotheses generated by case studies can sometimes enable researchers to recognize
causal relationships in a way that is not possible in experimental research [47].
There were two different types of jet printing jobs used for collecting the data, both
of which are frequently used by Mycronic's test engineers. One was a simpler job, while
the other was a more complex and realistic job. The simpler one was a BGA test, where
the shots had a constant diameter and were shot at a constant frequency. Since the
majority of jobs used by customers do not have constant diameters and frequencies,
this job was considered to be a simple test. The other job was the RT1 job, where the
diameter of the shots varied between five different values and the frequency between
three different values. Both tests are explained in more detail in Section 2.1.2. These
tests are used by Mycronic to evaluate the robustness of the machine, but they can
also be considered to be realistic tests since they simulate the machine usage by the
customers. Jobs with varying variables are more difficult for the neural networks to predict. By
performing these two types of jobs, a clearer and less ambiguous evaluation of
the neural network could be made. Since the tests were performed on a real machine
using real jet jobs used by test engineers, the tests could be generalized for the given
combination of ejector and solder paste used in this project.
3.2 Internal and External Validity
In order to provide an answer to the research question, this project used a case study
which was split into two different tests with different amounts of varying variables.
The two cases were chosen such that the performance of the neural network could be
verified and also so that the realism and internal validity of the project would increase.
As mentioned in [48], the internal validity of a project can increase if multiple data
sources of the same method are used. By using the two different jobs, the project
covered a broader range of different usages of the MY700. The BGA job was useful to
first find potential correlations between the sensor data and the quality of the deposits,
while the RT1 job both checked the robustness of the neural network and simulated the
real environment it would be active in. Another factor that was important for the
internal validity of this project was that the created neural network was capable of
finding correlations in the data if there were any. The ability of the neural network
to recognize patterns was verified and is described in Chapter 4, but it is possible that
other architectures than the one chosen would be even better at finding correlations.
In [49], it is mentioned that external validity relates to how generalizable a study
is. After discussion with the stakeholders at Mycronic, it was decided that the main
focus of the project was hypothesis creation and to create a framework upon which
further studies can be performed. Due to time and resource limitations, it was not in
the scope of the project to evaluate with other hardware setups, such as using different
machines, ejectors or solder pastes. This could however be done in future projects to
further examine the generalizability of the results and to improve the external validity.
3.3 Procedure
The first phase of the project was to create the project formulation, decide on a
methodology and create a project time plan. After that, a background study about
ejector technology, the sensors that were used and neural network architectures was
done. Since data collection was one of the main components of this project, it was of
high importance to carefully study both which data points to collect and how to process
them. After the prestudy was finished, the test setup was created and data was gathered
from a MY700. The temperature sensor data was gathered from three different places
on the ejector, as explained in Section 1.2 and shown in Figure 1.2.1, and a shunt
resistor was used to measure the current through the piezoelectric actuator. When
this configuration had been built, the two types of jobs explained above were run. The
data from the sensors and the jetting deposits was given to the neural network and
training was performed. Finally, the accuracy of the neural network was evaluated and
conclusions were drawn.
Replicability was considered during the design of the project in order to enable
future improvements to the project. It is mentioned in [42] that replicability is one
of several quality assurances to be aware of when designing the methodology. For this
project, the procedure for setting up the hardware components in order to gather
data is documented, as is the verification and validation process of each component.
Furthermore, a seminar has been held at Mycronic for the engineers, for the purpose
of transferring knowledge of the project and how to replicate the tests.
Chapter 4
Implementation
The following chapter will first explain the implementation of the hardware and
software that have been developed and used throughout the project in Section 4.1
and Section 4.2, respectively. Furthermore, the procedure to perform the cases is
clarified in Section 4.3. Lastly, verification and validation of units of the system will be
presented in Section 4.4.
4.1 Hardware Configuration
This section describes how the machine, cassette and ejector were modified in order
to install the sensors. An overview of the hardware system is shown in Figure 4.1.1,
where the highlighted components are the Red Pitaya, the trigger signal from the piezo PCB
driver and the cassette containing the solder paste and ejector. The cassette contains
the current sensor and the ejector contains the temperature sensors. Furthermore,
a PCB was developed in order to enable the Red Pitaya to read the values from the
temperature sensors as well as the trigger signal from the MY700.
Figure 4.1.1: The hardware setup in the MY700 for data gathering.
Modifications to the MY700 included two Bayonet Neill–Concelman (BNC) coaxial
connectors that were connected to the debug pins of the piezo PCB to extract the trigger
signal, which is marked as “Trigger signal” in Figure 4.1.1. This signal is used by the
Red Pitaya in order to initiate sampling from the sensors. The Red Pitaya was mounted
on top of the horizontal beam which moves the printing head back and forth, as shown
in Figure 4.1.1. A power cable and an Ethernet cable were run via the wiring harness
out through the back of the machine. The wires for the sensors along with the coaxial
cable for the trigger signal were run in the opposite direction up to the printing head.
Cable ties were used to attach the cables, and extra caution was taken to ensure that no
cables came in the way of the machinery.
Figure 4.1.2 illustrates the data transfer between the different hardware
components. The connection between the MY700 and the database, labeled “Job
data”, represents the quality measurements taken by the machine. The data is
obtained through a series of photographs of the deposits taken by the machine. The
images undergo a processing stage to extract quality measurements, such as diameter,
positioning, shape and satellites. These measurements are then sent to the database
to be stored together with their corresponding sensor data.
Figure 4.1.2: Flow chart illustrating the data transfer between different components.
4.1.1 Sensors
In Chapter 1, it was explained that three positions for the temperature measurement
would be used, as shown in Figure 1.2.1. The positioning of the sensors was determined
by the work in [4], and a more extensive explanation of the importance of the
positions is given in Chapter 1. Since these positions include components
and mechanisms that can be affected by changes in the rheological properties, which
are a function of temperature, it was decided to maintain these temperature sensor
positions. However, it was noticed that the mounting of the sensors could be improved.
In the previous assembly, the sensors were inserted into the drilled holes
and fixed in position with a glue gun. This setup ran the risk of
insulating the sensors from the ejector chassis if some glue had entered the
hole. Thus, a new ejector was modified with the same configuration as in [4], but a
heat conductive paste was added into the holes with the sensors before fastening them
with a glue gun. This ensured a more accurate temperature reading from the sensors.
To measure the temperature, IT-18 thermocouples [50] were used which have an
accuracy of ±0.1 °C. These were the same type of sensors as the ones used in [4],
and the blue arrows in Figure 4.1.2 show the data measured by the thermocouples
being transferred to the Red Pitaya. In this project, the thermocouples measured
temperatures in the range of 20 °C to 40 °C. The signals from the thermocouples in
that range were between 1.196 mV and 1.612 mV [51]. These signals had to be amplified,
which is explained in Section 4.1.2 below. The thermocouple that was placed by the
ejector nozzle protruded slightly from the bottom of the ejector. This meant that the
distance between the nozzle and the substrate had to be increased from the default
value of 650 μm to 800 μm when operating the MY700.
To measure the current, it was decided to keep the setup used in [4], which was to
have a shunt resistor on the low side of, and in series with, the piezoelectric actuator.
This reduced any issues with the common-mode voltage. If the current and resistance
are not too high, a shunt resistor also dissipates little power, since
$$P_D = I^2 \cdot R, \qquad (4.1)$$
where P_D is the dissipated power, I is the current and R is the resistance. Thus, the
shunt has little influence on the circuit. Two shunt resistors were mounted in parallel,
each with a resistance of Rs = 0.1 Ω, which resulted in a total resistance of R = 0.05
Ω. The green arrow in Figure 4.1.2 shows how the data measured by the shunt resistor
is transferred to the Red Pitaya.
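The shunt arithmetic in this subsection can be checked with a short sketch. Only the two 0.1 Ω resistors and Equation 4.1 come from the text; the 1 A current and the 20 mV voltage drop are illustrative values chosen for the example, not measurements from this project, and the function names are hypothetical.

```python
def parallel_resistance(*resistances):
    """Equivalent resistance of resistors connected in parallel."""
    return 1.0 / sum(1.0 / r for r in resistances)


def dissipated_power(current, resistance):
    """Power dissipated in a resistor, P_D = I^2 * R (Equation 4.1)."""
    return current ** 2 * resistance


def current_from_voltage(voltage, resistance):
    """Ohm's law: current through the shunt from the measured voltage drop."""
    return voltage / resistance


# Two 0.1 ohm shunt resistors in parallel, as described in the text.
R_TOTAL = parallel_resistance(0.1, 0.1)

# Illustrative values only (assumed, not measured in the thesis).
example_current = 1.0    # A
example_voltage = 0.020  # V across the shunt

print(R_TOTAL)
print(dissipated_power(example_current, R_TOTAL))
print(current_from_voltage(example_voltage, R_TOTAL))
```

Halving the resistance by placing two resistors in parallel also halves the dissipated power for a given current, at the cost of a smaller voltage drop to measure.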
4.1.2 Red Pitaya
It was decided that the same Red Pitaya as the one used in [4] would be used for
this project as well. The three thermocouples were connected to the Red Pitaya via
a PCB which was mounted on top of the Red Pitaya, as can be seen in Figure 4.1.3.
Three thermocouple amplifiers with cold junction compensation were used to amplify
the signals from the thermocouples. Cold junction compensation means that these
integrated circuits (ICs) use an ice point reference to provide a temperature reference
for the thermocouples, which was needed in order to make temperature readings.
Once amplified, a 10 mV change of the output signal corresponded to a 1 °C change
in temperature [52]. All output signals were then given to the analog input pins of the
Red Pitaya. A BNC coaxial connector was mounted on top of the PCB and connected
to one of the digital input pins of the Red Pitaya. This was used to register a trigger
signal which was sent out from the machine every time a shot was ejected. This trigger
signal let the Red Pitaya know when to take a measurement and is shown as the red
arrow in Figure 4.1.2. The temperature and trigger signals, along with ground and
5 V signals, were connected between the PCB and Red Pitaya using two flat cables
with a 26-way insulation-displacement contact (IDC) connector plug at each end. The
current sensor was connected directly to one of the 14-bit fast channels of the Red
Pitaya, with the sampling rate set to 15.6 MHz. This was done using another BNC
connector, and the connection is shown as the green arrow in Figure 4.1.2. The low voltage
input (±1 V) of the fast channel was used, since the signal would never exceed ±1
V with the chosen shunt resistor. The signal was then downsampled to 3.9 MHz
to reduce memory usage and increase efficiency. A sampling rate of 3.9 MHz was
deemed sufficient, since it gave 390 samples per current curve, which was enough
to resolve the important trends. A 14-bit resolution was also deemed to be enough
since this gave a resolution of approximately 0.12 mV (2 V / 2^14). Observations showed that this
sampling frequency and resolution gave a smooth and continuous curve. A schematic
overview of the Red Pitaya can be found in Appendix A.
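The sampling figures in this subsection follow from simple arithmetic, sketched below. The constants are taken from the text; the duration of one current curve is derived from them, and the use of plain decimation (keeping every n-th sample) is an assumption, since the text does not state how the downsampling was implemented. The helper names are hypothetical.

```python
ADC_BITS = 14
INPUT_RANGE_V = 2.0        # the low voltage input spans -1 V to +1 V
RAW_RATE_HZ = 15.6e6
TARGET_RATE_HZ = 3.9e6
SAMPLES_PER_CURVE = 390

lsb_volts = INPUT_RANGE_V / 2 ** ADC_BITS          # smallest voltage step
decimation = round(RAW_RATE_HZ / TARGET_RATE_HZ)   # keep every n-th sample
window_s = SAMPLES_PER_CURVE / TARGET_RATE_HZ      # duration of one curve


def downsample(samples, factor):
    """Keep every `factor`-th sample (plain decimation, assumed here)."""
    return samples[::factor]


def amplifier_delta_to_celsius(delta_volts, volts_per_degree=0.010):
    """Convert a change in amplifier output to a temperature change (10 mV per degree C)."""
    return delta_volts / volts_per_degree


print(lsb_volts, decimation, window_s)
```

With these numbers, the quantization step is about 122 µV, the decimation factor is 4, and 390 samples cover 100 µs per shot.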
Figure 4.1.3: The PCB used for data gathering.
4.2 Software Configuration
Two main scripts were used in this project: one for running the
MY700 and collecting data from the sensors during operation, and one for the
neural network. Both scripts were written in Python, and the neural network was
designed with TensorFlow, which is an open-source software library for machine
learning.
The first script had already been developed, but it was designed for the earlier MY600
machine, so some changes had to be made to adapt it to the MY700. When the MY700
had finished a job, the data recorded by the Red Pitaya was stored in an open-
source database called MongoDB. Once this step was completed, the MY700 continued its
process by scanning the quality of the deposits and sending this data to the database as well,
as seen in Figure 4.1.2. Uploaded data could easily be accessed for further modification
before being fed to the neural network.
The design of the neural network followed the architecture of a stacked LSTM
model. It was decided to use an LSTM model for this project since its architecture
reduces the risk of vanishing or exploding gradients compared to other
recurrent neural networks, and for its advantages in processing sequential data.
Different LSTM architectures were evaluated by manually comparing
their performance, and it was concluded that the stacked LSTM performed best
on the data. The model predicted the diameter of the next shot based on the
data from the thermocouples and the current sensor. It was given a window size
of 10, that is, the number of previous values used for making predictions. For RT1,
the number of epochs was increased from 25 to 35 since more time was needed for the
loss to stagnate. Different optimizers, such as stochastic gradient descent (SGD) and
Adam, were evaluated for this problem, and Adam showed the most promising
results. This optimizer has the advantage of adapting individual learning rates for
each parameter of the neural network instead of using a single constant learning rate
for all parameters as in, for example, SGD. The loss function of the neural network
was the MAE, which is defined in Equation 2.7.
A summary of the LSTM architecture and its parameters that were used and
evaluated to suit the purpose of this project is shown in Table 4.2.1. This design was
determined by manually tuning parameters of the network, deciding the number of layers,
selecting the window and batch size, and recording the performance in order to decide
on a design. For the activation function in the output layer, both tanh and linear were
evaluated, and linear showed the best results. A possible reason is that the
model predicts a continuous value, for which, as described in Section 2.5, a linear
activation function in the output layer is suitable.
Table 4.2.1: The LSTM architecture for evaluation.
Type | Network architecture | Optimizer | Window size | Batch size | Epochs
Stacked LSTM | Layer 1: 200, Dropout: 0.4, Layer 2: 200, Dropout: 0.4, Dense: 1, Activation: Linear | Adam | 10 | 256 | 25/35
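The architecture in Table 4.2.1 could be expressed with the Keras API of TensorFlow roughly as follows. This is a sketch under stated assumptions, not the project's actual script: it assumes that "200" denotes the number of LSTM units per layer, that the dropout is applied as separate layers after each LSTM layer, and that the number of features depends on the chosen input configuration. The names `ARCHITECTURE` and `build_stacked_lstm` are hypothetical.

```python
# Values from Table 4.2.1; the dictionary layout itself is an illustrative choice.
ARCHITECTURE = {
    "lstm_units": (200, 200),
    "dropout": 0.4,
    "window_size": 10,
    "batch_size": 256,
    "epochs": {"BGA": 25, "RT1": 35},
}


def build_stacked_lstm(n_features, config=ARCHITECTURE):
    """Build and compile a two-layer stacked LSTM regressor (sketch)."""
    # Imported lazily so the configuration can be inspected without TensorFlow.
    import tensorflow as tf

    model = tf.keras.Sequential([
        tf.keras.Input(shape=(config["window_size"], n_features)),
        # return_sequences=True passes the full sequence on to the second LSTM layer.
        tf.keras.layers.LSTM(config["lstm_units"][0], return_sequences=True),
        tf.keras.layers.Dropout(config["dropout"]),
        tf.keras.layers.LSTM(config["lstm_units"][1]),
        tf.keras.layers.Dropout(config["dropout"]),
        # A single linear output for the predicted diameter.
        tf.keras.layers.Dense(1, activation="linear"),
    ])
    # Adam optimizer with mean absolute error as the loss.
    model.compile(optimizer="adam", loss="mae")
    return model
```

Training would then call `model.fit` with the batch size and epoch count from the table, passing the validation data so that both losses are recorded after each epoch.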
Before the data was fed to the neural network, it was standardized to a mean
of zero and a standard deviation of one. Since the inputs had different ranges,
it was beneficial for the neural network to receive standardized data so that it would not
form a bias toward any of the inputs while predicting quality. Standardization also helped in the back-
propagation phase, so that the neural network converged more easily. After the data had
been standardized, it was split into training, validation and test data, where the training
data was about 80% of the dataset, which is a commonly used split.
During the training phase of the LSTM model, the training data was used, and after each
completed epoch, the learning process was validated using the validation
data. After each epoch, the loss was calculated and registered for both the training and
the validation process. When all epochs were completed, that is, when the training of the
LSTM model was finished, the model was evaluated on the unseen test data.
The input data to the model had to have three dimensions: batch
size, time steps and features, where batch size is the number of samples per iteration,
time steps is the number of past time steps in one sample and features is the number
of observations in one time step.
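The preprocessing steps described above can be sketched in plain Python. The data below is synthetic and only serves to show the resulting shapes. The 10% validation share and the pairing of each target with the ten preceding rows are assumptions made for illustration, since the text only states the training share, the window size, and that the next shot is predicted.

```python
from statistics import mean, pstdev


def standardize(column):
    """Scale one feature column to zero mean and unit standard deviation."""
    mu, sigma = mean(column), pstdev(column)
    return [(x - mu) / sigma for x in column]


def make_windows(features, targets, window_size=10):
    """Pair each target with the `window_size` feature rows that precede it."""
    inputs, outputs = [], []
    for end in range(window_size, len(features)):
        inputs.append(features[end - window_size:end])  # (time steps, features)
        outputs.append(targets[end])
    return inputs, outputs


def chronological_split(items, train_fraction=0.8, validation_fraction=0.1):
    """Split in time order; the validation share is an assumed value."""
    n_train = int(len(items) * train_fraction)
    n_val = int(len(items) * validation_fraction)
    return items[:n_train], items[n_train:n_train + n_val], items[n_train + n_val:]


# Toy data: 30 shots with 2 features each (synthetic, for shape checking only).
rows = [[float(i), float(i % 3)] for i in range(30)]
columns = [standardize(list(col)) for col in zip(*rows)]
features = [list(row) for row in zip(*columns)]
diameters = [380.0 + i for i in range(30)]

x, y = make_windows(features, diameters)
x_train, x_val, x_test = chronological_split(x)
print(len(x), len(x[0]), len(x[0][0]))  # samples, time steps, features
```

Each element of `x` has the (time steps, features) layout, so stacking a batch of them gives the three-dimensional input described in the text.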
4.3 Experimental Procedure
The case study examined two different cases, which are presented in Table 4.3.1 with
their associated variable configuration. The first case was a BGA job which was run
with constant variables while the second case was an RT1 job where the frequency and
diameter varied between 160-300 Hz and 330-520 µm, respectively. In both of these
cases, the measured variables were the temperature and the current.
Table 4.3.1: Variable configuration for the BGA and RT1 job. The BGA job has one combination and the RT1 job has 15 unique combinations.
Job | Diameter [µm] | Frequency [Hz]
BGA job | 380 | 200
RT1 job | 330, 370, 429, 482, 520 | 160, 230, 300
The procedure was almost identical for the two cases. The MY700 was initialized
by first performing an extended purge which made sure that the ejector was filled with
solder paste. This was followed by a machine calibration procedure to ensure that the
solder paste deposits were acceptable. From the Python script, either the BGA or RT1
job was selected to be performed by the MY700 and it followed the instructions given
in Section 4.2. The BGA job created a 6 × 10 grid of generic BGA patterns where each
BGA contained 460 dots, as shown in Figure 2.1.3a. This resulted in 27,600 shots per
BGA job. The RT1 job consisted of several rows of solder paste deposits with different
combinations of diameter and frequency, in total 15 unique combinations and 34,200
shots. Several runs were made for both job types, and data was collected from almost
400,000 shots in total for each job type. Out of these, the last 50,000 shots were
used as test data to evaluate the performance of the neural network after training was
completed. For both BGA and RT1, each deposit had one temperature data point per
temperature sensor and 390 current data points. When the data from the runs was
available in the database, it was downloaded and given to the neural network. How the
data was handled before being fed to the neural network is explained in Section 4.2.
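The job sizes stated above can be reproduced with a few lines. The values come from Table 4.3.1 and the description of the BGA grid; the variable names are illustrative.

```python
from itertools import product

RT1_DIAMETERS_UM = (330, 370, 429, 482, 520)
RT1_FREQUENCIES_HZ = (160, 230, 300)

# Every diameter is combined with every frequency.
rt1_combinations = list(product(RT1_DIAMETERS_UM, RT1_FREQUENCIES_HZ))

# A 6 x 10 grid of BGA patterns with 460 dots each.
bga_shots_per_job = 6 * 10 * 460

print(len(rt1_combinations), bga_shots_per_job)
```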
The development of the neural network was an iterative process in terms of
parameter settings. However, Table 4.2.1 shows the architecture together with its
parameters that had the best performance on the data. In order to answer the research
question, five cases for the LSTM model were designed, as seen in Table 4.3.2. The first
case had the current data as input and served as a baseline. The next three cases used
data from each individual temperature position with supporting data from the current.
Finally, a test with all of the sensors was performed to see if all the sensors together
increased the performance. The cases in Table 4.3.2 were performed with a BGA job.
The case that showed the most promising performance was tested with the RT1 job.
This was done in order to test the robustness of the best LSTM model found.
Table 4.3.2: The BGA job evaluation cases for the LSTM model, each having different input configurations to the model.
Case | Current | Temp 1 | Temp 2 | Temp 3
1 | X | - | - | -
2 | X | X | - | -
3 | X | - | X | -
4 | X | - | - | X
5 | X | X | X | X
4.4 Verification and Validation
This subsection will explain the procedure for verification and validation of the sensors
in order to meet the requirements and to answer the research question.
4.4.1 Thermocouples
Before the thermocouples were mounted on the ejector, they were tested and
calibrated. The calibration procedure started with connecting the sensors to the PCB.
The tips of the sensors were kept at the same position during calibration. A reference
thermometer was used to register the ambient air temperature. The probe of the
thermometer was placed together with the sensors. Before any calibration tests were
performed, the thermometer had to adapt to the surrounding temperature. Within two
hours the thermometer had stabilized at 23.6 °C. The calibration was based on three
tests, where the mean value of the data from each thermocouple was calculated and
used in order to add a software offset such that the thermocouples showed 23.6 °C.
Figure 4.4.1a shows the thermocouples before calibration and Figure 4.4.1b shows the
thermocouples after calibration.
Figure 4.4.1: Measured temperature by the thermocouples. (a) The measured ambient temperature before calibration. (b) Calibrated thermocouples to surrounding temperature (23.6 °C).
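The software offset used in the calibration can be sketched as follows. The readings are synthetic, and pooling all readings from the three tests before taking the mean is an assumption (it gives the same result as averaging the per-test means when the tests are equally long). The function names are hypothetical.

```python
from statistics import mean

REFERENCE_C = 23.6  # stabilized reading of the reference thermometer


def calibration_offset(test_runs, reference=REFERENCE_C):
    """Offset that moves the mean reading over all test runs onto the reference."""
    readings = [value for run in test_runs for value in run]
    return reference - mean(readings)


def apply_offset(raw_values, offset):
    """Add the software offset to every raw reading."""
    return [value + offset for value in raw_values]


# Synthetic readings for one thermocouple over three tests (illustrative only).
runs = [[24.9, 25.1, 25.0], [25.2, 25.0, 24.8], [25.1, 24.9, 25.0]]
offset = calibration_offset(runs)
calibrated = apply_offset(runs[0], offset)
print(offset, calibrated)
```

One offset would be computed per thermocouple, since each sensor and amplifier pair can deviate by a different amount.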
It was also important that the sensors could respond to variations in temperature.
This was tested with the calibrated thermocouples. To validate that
the thermocouples responded to temperature changes, they were first kept at room
temperature, 23.6 °C, then held between two fingertips, and finally released. It
can be seen in Figure 4.4.2 that the sensors responded to the temperature changes.
Figure 4.4.2: Calibrated thermocouples. The response from each thermocouple as temperature is varying.
So that the wires of the thermocouples would not interfere with the movement of
the MY700, they had to be routed along the high current cables to the cassette holder.
The wires of the thermocouples only have a thin layer of insulation, which put them
at risk of being affected by electromagnetic interference (EMI). It was observed that
when the thermocouples were mounted on the ejector in the MY700, the signals became
noisier, as seen in Figure 4.4.3a. This was likely due to the EMI from the cables of
the machine. A low pass filter was applied to the measured signal to compensate for
this. Since the temperature is expected to change relatively slowly, a cutoff frequency
of f_cutoff = 3 Hz was chosen. The filtered signal is shown in Figure 4.4.3b.
Figure 4.4.3: Measured temperature by the thermocouples during an RT1 job. (a) The measured temperature before filtering. (b) The measured temperature after filtering.
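The effect of a 3 Hz low pass filter can be illustrated with a first-order filter. The text does not state which filter type or order was used, so the discretized RC filter below is a stand-in chosen for simplicity. The 200 Hz sample rate (one temperature sample per shot) and the noise pattern are also assumptions.

```python
import math


def low_pass(samples, cutoff_hz, sample_rate_hz):
    """First-order IIR low pass filter (discretized RC filter)."""
    dt = 1.0 / sample_rate_hz
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    alpha = dt / (rc + dt)
    filtered = [samples[0]]
    for value in samples[1:]:
        # Move a fraction alpha of the way from the last output to the new sample.
        filtered.append(filtered[-1] + alpha * (value - filtered[-1]))
    return filtered


# Synthetic signal: a constant 30 degrees C with alternating +/-0.5 degrees C noise,
# sampled at an assumed 200 Hz.
noisy = [30.0 + (0.5 if i % 2 else -0.5) for i in range(400)]
smooth = low_pass(noisy, cutoff_hz=3.0, sample_rate_hz=200.0)

tail = smooth[-50:]
print(max(tail) - min(tail))
```

The low cutoff removes fast fluctuations, which suits a slowly changing temperature but also suppresses any genuine shot-to-shot temperature differences.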
4.4.2 Shunt resistor
The positioning of the shunt resistor is shown in the simplified schematic in Figure
4.4.4. The Red Pitaya measures the voltage drop across the resistance to calculate the
current using Equation 2.2.
Figure 4.4.4: A schematic of the positioning of the shunt resistor that is on the low side of the piezo. The resulting resistance is R = 0.05 Ω by having two shunt resistors, Rs = 0.1 Ω, in parallel [4].
The first part of verifying and validating the shunt resistor was to verify a similar
behaviour of the current of the shots as in [4]. This was achieved and the current
curve can be seen in Figure 4.4.5. This figure shows the mean value and the standard
deviation of all the shots from a BGA job.
Figure 4.4.5: Measured piezo current when jet printing a BGA job with the diameter of the deposits being 380 µm. The dark blue curve shows the mean current of all shots and the shaded area around the curve illustrates the standard deviation.
However, it was also of interest to evaluate potential noise from the circuit, which
was done by connecting the probe and its ground to the high side of the shunt resistor.
This resulted in values being different from zero as different BGA jobs were performed,
as seen in Figure 4.4.6a, indicating that there was noise affecting the measurement.
The diameter range in which the ejector deposited during the RT1 test was between
330 µm and 520 µm. Figure 4.4.6a indicates a difference in noise depending on the
diameter of the deposited solder paste. Furthermore, the RT1 operated between 160
Hz and 300 Hz and the noise from the extreme values shows almost identical curves
in Figure 4.4.6b, showing that the frequency had a small effect on the noise.
Figure 4.4.6: Measured noise from the current sensing with different BGA job settings. (a) Diameters: 330 µm, 380 µm, 429 µm, 520 µm. Frequency constant at 300 Hz. (b) Frequencies: 160 Hz, 300 Hz. Diameter constant at 330 µm.
It was suggested to subtract the measured noise from the measured current. Doing so
resulted in Figure 4.4.7, which has a similar shape to the waveform in Figure 1.2.2.
Figure 4.4.7 also shows a rise time of 34 µs and a plateau time of 50 µs, which
corresponded to the parameter settings for the job, indicating that the waveform is correct.
Figure 4.4.7: Calibrated measurement of current in a BGA job with the diameters of the deposits being 380 µm. The dark blue curve shows the mean current of all shots and the shaded area around the curve illustrates the standard deviation.
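The noise correction and the mean and standard deviation curves shown in the figures can be sketched as sample-wise operations over the shots. The example uses four samples per curve instead of 390, and all values are synthetic. Averaging several noise recordings into one reference curve is an assumption about how the subtraction was carried out.

```python
from statistics import mean, pstdev


def mean_curve(curves):
    """Sample-wise mean over a list of equally long curves."""
    return [mean(samples) for samples in zip(*curves)]


def std_curve(curves):
    """Sample-wise (population) standard deviation over the curves."""
    return [pstdev(samples) for samples in zip(*curves)]


def subtract_noise(curve, noise_reference):
    """Remove a previously recorded noise curve from a measured curve."""
    return [c - n for c, n in zip(curve, noise_reference)]


# Tiny synthetic example with 4 samples per curve.
noise_runs = [[0.10, 0.00, -0.10, 0.00], [0.12, 0.00, -0.08, 0.00]]
shots = [[1.10, 2.00, 1.90, 0.00], [1.12, 2.02, 1.92, 0.02]]

noise_reference = mean_curve(noise_runs)
corrected = [subtract_noise(shot, noise_reference) for shot in shots]
spread = std_curve(corrected)
print(mean_curve(corrected), spread)
```

Because the measured noise depended on the deposit diameter, a separate noise reference per diameter setting would be needed for a job such as RT1.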
4.4.3 Prediction Model
The developed LSTM model is summarized in Table 4.2.1. To verify and validate its
architecture and parameter settings, it was given the time series data set in [53], with the
expectation that its predictions should conform with the predictions made in [53]. The data set is used for
airline passenger prediction, and this project's LSTM model gave similar predictions to
the one in [53], which verifies that the model can make time series predictions. Figure
4.4.8a shows the predictions by the LSTM model in [53], while Figure 4.4.8b shows the
predictions made by the LSTM model from this project. The model designed for this
project conforms with the model in [53]; thus, it is deemed to be verified and validated
for its purpose.
Figure 4.4.8: Verification and validation of the LSTM model developed in this project by comparing its predictions with the LSTM model in [53]. (a) The training and test predictions of the LSTM model in [53]. (b) The training and test predictions of this project's LSTM model.
Chapter 5
Results
In this chapter, the results from the test cases are shown. Section 5.1 gives the results
of the performance of the LSTM model using the BGA job, while Section 5.2 shows the
results when using the RT1 job. Loss curves, which have been described in Chapter
2, are presented. Graphs which compare the true and predicted diameter, as well as
graphs showing the distribution of the predictions, are also presented. The metrics
used for evaluating the diameter prediction of the LSTM model are MAE, MRE and
the number of predictions where the relative error of the predicted diameter was
greater than 8% (Bad predictions). These metrics are presented in tables. Section 5.3
addresses to what extent the requirements were fulfilled in this project.
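The three metrics can be written compactly as below. This is a sketch: the relative error is assumed here to be taken with respect to the true diameter, and the example diameters are synthetic. The function names are hypothetical.

```python
def mean_absolute_error(true_values, predictions):
    """MAE in the unit of the inputs (here micrometres)."""
    return sum(abs(t - p) for t, p in zip(true_values, predictions)) / len(true_values)


def mean_relative_error(true_values, predictions):
    """Mean of |true - predicted| / true, returned in percent."""
    total = sum(abs(t - p) / t for t, p in zip(true_values, predictions))
    return 100.0 * total / len(true_values)


def count_bad_predictions(true_values, predictions, threshold=0.08):
    """Number of predictions whose relative error exceeds the threshold."""
    return sum(1 for t, p in zip(true_values, predictions) if abs(t - p) / t > threshold)


# Synthetic diameters in micrometres (illustrative only).
true_um = [380.0, 400.0, 360.0, 380.0]
pred_um = [380.0, 390.0, 396.0, 376.2]

print(mean_absolute_error(true_um, pred_um))
print(mean_relative_error(true_um, pred_um))
print(count_bad_predictions(true_um, pred_um))
```

In this toy example only the third prediction (10% relative error) counts as bad, even though it dominates the MAE.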
5.1 BGA Results
The training process of the LSTM model using the BGA job is presented in Figure
5.1.1, which shows the training and validation loss when the LSTM model was given
all sensor data. The losses of the other sensor combinations from Table 5.1.1 are shown
in Appendix B.
Figure 5.1.1: Training and validation loss when training the LSTM model using a BGA job. Trained for 25 epochs. Input parameters are the temperature from the three thermocouples and the current.
Table 5.1.1 contains the BGA test results with different combinations of input
parameters as shown in Table 4.3.2. Three runs were performed for each combination
using the test data, and the average values from the runs are presented in the table. The
input parameters are, as mentioned earlier, the three temperature sensors (T1, T2, T3),
whose placements are defined in Figure 1.2.1, and the current (I). Results when only
temperature data and no current data is used are shown in Appendix C.
Table 5.1.1: BGA test results for different sensor combinations.
Sensor combination | I | I, T1 | I, T2 | I, T3 | I, T1, T2, T3
MAE [µm] | 5.81 | 5.82 | 5.77 | 5.82 | 5.82
MRE [%] | 1.52 | 1.52 | 1.51 | 1.52 | 1.52
Bad predictions | 71 | 94 | 61 | 77 | 80
In addition to the table above, showing the performance of the LSTM model on
unseen test data, graphs of the predicted and true diameter of each shot are shown
below. These graphs are shown in Figure 5.1.2, Figure 5.1.3 and Figure 5.1.4, where
each figure has two subfigures with different window scales: one which shows all test
data and one which shows a smaller region chosen arbitrarily. Figure 5.1.2 only has
current data as input, Figure 5.1.3 has input data from the three temperature positions
individually, together with supporting data from the current sensor, and Figure 5.1.4
uses all sensor data as input.
Figure 5.1.2: Predicted and true diameter using a BGA job with I as input. (a) The complete run of the test data. (b) Diameters of 250 shots between shot 14,000 and 14,250.
Figure 5.1.3: Predicted and true diameter using a BGA job with (T1, I), (T2, I) and (T3, I) as input. (a) The complete run of the test data. (b) Diameters of 250 shots between shot 14,000 and 14,250.
Distribution graphs of the predicted and true diameter of each run are shown in
Figure 5.1.5. In Figure 5.1.5b, the blue and red curves are hard to distinguish since
they directly overlap. The green curve protrudes at the top, but the majority of its body
overlaps with the other curves.
Figure 5.1.4: Predicted and true diameter using a BGA job with T1, T2, T3 and I as input. (a) The complete run of the test data. (b) Diameters of 250 shots between shot 14,000 and 14,250.
Figure 5.1.5: Distribution of predicted diameters and distribution of true diameters using a BGA job. (a) Input variable is I. (b) Input variables are I and each of the temperature sensors. (c) Input variables are I, T1, T2 and T3.
5.2 RT1 Results
The training process of the LSTM model using the RT1 job is presented in Figure 5.2.1,
which shows the training and validation loss.
Figure 5.2.1: Training and validation loss when training the LSTM model using an RT1 job. Trained for 35 epochs.
Table 5.2.1 contains the RT1 test results with the combined sensor data as input.
Three runs were performed for this combination using the test data and the average
values from the runs are presented in the table.
Table 5.2.1: RT1 test results for selected sensor combination.
Sensor combination | I, T1, T2, T3
MAE [µm] | 6.22
MRE [%] | 1.50
Bad predictions | 132
Figure 5.2.2 shows the predicted and true diameter with two different window
scales for the sensor combination I, T1, T2 and T3, as presented in Table 5.2.1. In
Figure 5.2.2a, each plateau contains shots deposited at 160, 230 and 300 Hz. Figure
5.2.3 shows the distribution of the predictions. Appendix C shows the graphs of the
performance of the LSTM model when only given the temperature data as input.
Figure 5.2.2: Predicted and true diameter using an RT1 job with I, T1, T2 and T3 as input. (a) The complete run of the test data. (b) Diameters of 250 shots between shot 8,000 and 8,250.
Figure 5.2.3: Distribution of the diameters predicted by the LSTM model and distribution of the true diameters using an RT1 job. The input configuration is I, T1, T2, and T3.
5.3 Fulfillment of Requirements
This section relates back to the requirements and answers to what extent they were
fulfilled based on the results from the cases.
• A neural network shall be trained to predict changes in the quality of jetting
deposits which later can be used for real-time prediction.
The neural network identified trends in the diameter, but variations between shots
could not be identified accurately by the neural network. It was more accurate
whenever the measured diameter followed a repeating pattern than when it was mostly
random. The neural network was trained and evaluated off-line to tune its weights so
that the model could get real-time sensor input to predict quality in terms of diameter.
• Temperature and current shall be measured and the data shall be used as input
to the neural network.
Both temperature and current were measured while performing the BGA and the RT1
job. The data was given to the neural network in order to make predictions about
the quality of the deposits. Improvements from previous work were made to both
the temperature and the current measurements to assure as accurate input data as
possible.
• Three different locations on the ejector shall be used for temperature
measurements.
The same configuration was used as in the previous year’s master’s thesis: a sensor at
the end of the Archimedes screw, at the solder paste container in the ejector and at the
chamber next to the piston. These positions are shown in Figure 1.2.1.
• The quality of the solder paste shall be based on the diameter of the shots and
these shall be measured using a MY700 jet printer.
The quality of the deposits, for this project, was based on their diameter. The neural
network predicted the diameter of the deposits based on its two types of sensor data
input. The tests were performed using the MY700, which is able to take quality
measurements of the deposits.
• Acceptable results from the neural network require the predicted diameter to
vary less than 8% from the actual diameter for individual predictions.
Not all of the predictions were within the acceptable range of 8% from the actual
diameter, but the majority were. For the BGA job, 80 out of 50,000 predictions were
non-acceptable when all available sensor data was used, and for the RT1 job, 132 out
of 50,000 predictions were non-acceptable.
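Expressed as a share of the test set, the counts above correspond to 0.16% of the BGA predictions and about 0.26% of the RT1 predictions, as the following arithmetic shows. The dictionary keys are illustrative labels.

```python
TEST_SHOTS = 50_000
bad_predictions = {"BGA (I, T1, T2, T3)": 80, "RT1 (I, T1, T2, T3)": 132}

# Share of test predictions with a relative error above 8%, in percent.
bad_share_percent = {
    job: 100.0 * count / TEST_SHOTS for job, count in bad_predictions.items()
}

for job, share in bad_share_percent.items():
    print(f"{job}: {share:.3f}% outside the 8% limit")
```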
Chapter 6
Discussion
In this chapter, the findings from the case study are discussed, including the BGA and
RT1 results, the sensor implementation, the requirements and the research method.
6.1 Test Cases
The results from the different test cases are shown in Chapter 5. The performance of
the neural network for these test cases is discussed below.
6.1.1 BGA
The first job that was examined was the BGA job, since this was considered to be the
simpler case. The neural network was first trained using only the three temperature
sensors as input to get an idea if the thermocouples alone would be enough to make
accurate diameter predictions. Since the current is not used in this case, it does
not relate directly to the research question. However, it does help to improve the
understanding of the effects of the current. The predictions for this case can be seen in
Figure C.0.1 in Appendix C. From the graphs, it becomes clear that the neural network
is not able to accurately predict the diameter of individual shots, but rather find an
average value for the diameter which the predictions are always close to. This was
expected considering that the temperature had been filtered with a low pass filter,
meaning that large differences in temperature between consecutive shots had been
filtered out. This was done to compensate for the noise, as mentioned in Section 4.4.1.
However, in the test data there are two clear deviations of the measured diameter where
the diameter deviates downwards, which can be seen in Figure C.0.1a. For those shots,
the predicted diameter reacts by oscillating more, which is an indication that the neural
network is able to notice some sort of deviation. While this configuration of sensors
was able to find a good average value for the BGA job, it did not work as well for the
RT1 job where the diameter varied, as seen in Appendix C. This is because it was not
able to anticipate the diameter changes from the temperature alone.
Next, the neural network was trained using only the current as input. Figure 5.1.2
shows the resulting predictions from this training. The graphs show that the neural
network no longer only predicts values close to the mean diameter, indicating the
current gives more information about individual shots. This is also confirmed by
looking at the distribution graph in Figure 5.1.5a, where a larger distribution for the
predictions can be seen than in Figure C.0.1b in Appendix C. It is also still possible to
find the two locations where the diameter deviates downwards. The average MAE and
MRE for this configuration were 5.81 μm and 1.52%, respectively, which can be seen
in Table 5.1.1. One of the requirements for this project was that the predicted diameter
should vary less than 8% from the actual diameter. The number of predictions where
the relative error was larger than 8% for this configuration was 71 out of 50,000 shots.
The next three cases were used to compare the viability of the different temperature
sensor locations. Each of the temperature sensors was tested together with the current.
The results from these cases can be seen in Table 5.1.1, and the numbers show that there
is no significant difference between the different locations. Using the current together
with Sensor 2 gave the fewest bad predictions. However, the number of bad
predictions could vary by 40 shots from run to run. This means that the differences
were likely not significant, which is also supported by the MAE and MRE numbers. The
numbers for these three cases are also similar to the case which only had the current as
input, suggesting that the current is responsible for the majority of the performance.
Figure 5.1.3 shows the predictions for all three cases, and it shows that the three cases
performed similarly. The distributions can be seen in Figure 5.1.5b, and they are also
similar for the three cases.
The final configuration that was tested with the BGA job was with all sensors.
Theoretically this should lead to the best performance since it has the most data
available, but the error was approximately the same as the previous cases, as shown
in Table 5.1.1. Figures 5.1.4 and 5.1.5c also show that the performance is similar. The
distribution seems to be slightly better when all sensors were used than when only the
current was used, but the difference is not significant and it could be due to results
varying slightly between runs. The loss curves for this configuration can be seen in
Figure 5.1.1. From this graph it can be seen that the training loss decreases while the
validation loss does not. This indicates that the neural network is struggling
to generalize, that is, it is overfitting to the training data.
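The pattern described here, a falling training loss combined with a stalled validation loss, can be detected programmatically from the recorded loss histories. The check below is a simple heuristic with an assumed patience of five epochs, and the loss values are synthetic rather than the ones behind Figure 5.1.1.

```python
def epochs_since_best(validation_losses):
    """Number of epochs elapsed since the lowest validation loss."""
    best_epoch = min(range(len(validation_losses)), key=validation_losses.__getitem__)
    return len(validation_losses) - 1 - best_epoch


def looks_overfitted(training_losses, validation_losses, patience=5):
    """Flag a run whose training loss still falls while validation has stalled."""
    still_improving = training_losses[-1] < training_losses[-1 - patience]
    return still_improving and epochs_since_best(validation_losses) >= patience


# Synthetic loss histories (illustrative only).
train = [0.80, 0.70, 0.62, 0.56, 0.51, 0.47, 0.44, 0.41, 0.39, 0.37]
val = [0.82, 0.76, 0.74, 0.75, 0.74, 0.75, 0.76, 0.75, 0.76, 0.75]

print(epochs_since_best(val), looks_overfitted(train, val))
```

The same idea underlies early stopping, where training is halted once the validation loss has not improved for a set number of epochs.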
6.1.2 RT1
The input parameters to the neural network while using the RT1 job were chosen to
be both types of sensor data, since the temperature could capture slower trends while the
current was, to some extent, better for single dot prediction. The purpose of this case
was to validate the robustness of the neural network by varying the diameter of the
deposits as well as the operating frequency. By comparing Table 5.2.1 and Table 5.1.1
it can be seen that the numbers are roughly the same, but with a slightly larger number
of bad predictions and larger MAE for the RT1 job. Training the neural network on a
more complicated pattern did not significantly decrease the quality of the predictions.
Figure 5.2.3 shows that there is a bigger spread of the predictions for the larger
deposited diameters. This does not tell us anything about the accuracy of the
predictions, but rather the ability to identify variations in the ejected diameters. From
Figure 5.2.2b and Figure 5.2.3 it can be seen that the neural network underestimates
the true diameter. The distribution of the predictions in Figure 5.2.3 is shifted slightly
to the lower range of diameters, which is exemplified in Figure 5.2.2b where the
majority of the predictions are in the lower range. This is clearly evident for the larger
deposits, while not so much for the smaller deposits.
Figure 5.2.1 shows that most of the learning happened in the earlier epochs, with
only a slight decrease in loss in the remaining epochs. A larger number of epochs
could potentially lead to overfitting, as for the BGA test, which would make the neural
network poor at generalizing on unseen data. However, one can distinguish a low
degree of underfitting, which causes the neural network to make poor decisions about
the underlying structure of the data. In other words, it makes assumptions about the
data rather than finding relationships between the inputs and outputs. This could be
an important finding since the quality of the predictions varies between the different
ejected diameters. That could possibly be explained by the underfitting.
6.1.3 Neural Network Performance
Since the measured diameters of most shots in both job types were concentrated
around the expected value, the neural network could base its predictions in that range
without being heavily penalized. Trying to predict the anomalies and larger deviations
will come at a risk of increasing the error if the prediction is not accurate. In other
words, the model needs to be able to see clear correlations in the data in order to
accurately predict larger deviations. From the graphs in Chapter 5, it could thus be said
that the data the LSTM model is working with does not have clear enough correlations
between input data and the measured diameter. This could be due to factors such as
noisy input data or the temperature and current not being sufficient on their own.
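The trade-off described in this section can be illustrated with a small numerical sketch. All diameters below are made-up values and the three prediction strategies are hypothetical; the sketch only shows why predicting close to the expected value is a low-risk choice when the error metric is MAE.

```python
def mean_absolute_error(true_values, predictions):
    """Mean absolute error between two equally long sequences."""
    total = sum(abs(p - t) for t, p in zip(true_values, predictions))
    return total / len(true_values)


# Hypothetical diameters in micrometres: eight nominal shots, two anomalies.
true_diameters = [400, 400, 400, 400, 400, 400, 400, 400, 440, 360]

# Strategy 1: always predict the expected value.
cautious = [400] * 10

# Strategy 2: predict deviations, but for the wrong shots.
mistimed = [440, 360, 400, 400, 400, 400, 400, 400, 400, 400]

# Strategy 3: predict deviations for the right shots.
accurate = list(true_diameters)

for name, preds in [("cautious", cautious),
                    ("mistimed", mistimed),
                    ("accurate", accurate)]:
    print(name, mean_absolute_error(true_diameters, preds))
```

With these numbers, a mistimed attempt to predict the anomalies doubles the MAE compared to always predicting the expected value, so a model without clear correlations to rely on is rewarded for staying near the mean.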
6.2 Sensors
As mentioned earlier, the thermocouples were affected by noise. Since the sensor
cables were mounted along the high current cables going to the piezo, the most
probable reason for the measurement noise is EMI. The low pass filter reduced this,
but it was difficult to design the filter such that only the noise would be filtered out and
that the true temperature signal would not be affected. It should also be noted that
the measurement from the three different locations of the temperature sensors gave
approximately the same temperature, as seen in Figure 4.4.3b. This could possibly
be explained in two ways: the aluminum body of the ejector diffused the generated
heat efficiently and adopted the same temperature, or not enough additional heat
was emitted from the locations to be registered by the thermocouples. Furthermore,
the possible heat generated at the three sensor locations, see Figure 1.2.1, most likely
introduced a lag since the heat had to cross the aluminum body before it was measured.
This could have impacted the quality of the prediction, especially when only having the
temperature data as input.
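The difficulty of filtering the thermocouple signal without affecting the true temperature can be sketched with a simple filter. The example below assumes a first-order IIR low-pass filter (a discrete RC filter), which is not necessarily the filter design used in this project; the sample rate, cutoff frequency and signal values are hypothetical.

```python
import math


def low_pass(samples, sample_rate_hz, cutoff_hz):
    """First-order IIR low-pass filter (discrete RC filter)."""
    dt = 1.0 / sample_rate_hz
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    alpha = dt / (rc + dt)  # smoothing factor between 0 and 1
    filtered = [samples[0]]
    for x in samples[1:]:
        filtered.append(filtered[-1] + alpha * (x - filtered[-1]))
    return filtered


# Hypothetical 29 degC reading with alternating +/-0.5 degC interference,
# sampled at 100 Hz.
noisy = [29.0 + 0.5 * (-1) ** i for i in range(200)]
smooth = low_pass(noisy, sample_rate_hz=100.0, cutoff_hz=1.0)

# Hypothetical step change in the true temperature, to expose the lag.
step = [29.0] * 5 + [30.0] * 5
lagged = low_pass(step, sample_rate_hz=100.0, cutoff_hz=1.0)

print(max(smooth[-50:]) - min(smooth[-50:]))  # ripple left after filtering
print(lagged[-1])                             # still well below 30.0
```

A lower cutoff frequency removes more of the interference but also delays the response to a real temperature change, which adds to the thermal lag discussed above.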
The temperature seemed to be fairly constant throughout the jobs, where the overall
trend showed that the temperature increased slightly as the job went on. The machine’s
ejector temperature was set to 29 °C, and Figure 4.4.3b shows that the measured
temperature from the thermocouples was close to that target. In last year’s master’s
thesis, the temperature varied more when the ambient temperature was 18 °C [4].
When it was 24 °C, it was mostly constant. The MY700 used in this project was
standing in a room where the ambient temperature was 23-24 °C, so the fact that the
temperature was fairly constant during a job coincides with the previous results in [4].
If the temperature in the room had been lower, meaning that the temperature would likely
vary more, it is possible that the gathered temperature data would have had a larger impact
on the performance.
When removing the noise from the current curve, the average noise curve was
measured and saved for each diameter. This was because the graphs in Figure 4.4.6
showed that it was the diameter that affected the noise and not the frequency. When
subtracting the noise from the measurement, the correct noise curve was chosen
depending on what the diameter of the shot was. A more accurate way of removing
the noise would have been to measure the noise for each individual shot, and this
would also improve the internal validity of this project. This was not feasible in this
project however, since it would double the amount of data that needed to be stored
and processed. High memory usage was already an issue in this project, and therefore
this method was not used.
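The noise removal described above can be sketched as follows. This is a simplified illustration under stated assumptions and not the project's actual code: the function names, the diameter key and all sample values are hypothetical, and each curve is assumed to be a list of equally spaced current samples.

```python
def average_curve(curves):
    """Element-wise mean of equally long noise recordings."""
    n = len(curves)
    return [sum(values) / n for values in zip(*curves)]


def build_noise_templates(noise_recordings):
    """Map each diameter setting to its average noise curve."""
    return {diameter: average_curve(curves)
            for diameter, curves in noise_recordings.items()}


def remove_noise(current_curve, diameter, templates):
    """Subtract the stored noise curve that matches the shot's diameter."""
    template = templates[diameter]
    return [s - n for s, n in zip(current_curve, template)]


# Hypothetical noise recordings for one diameter setting (values made up).
recordings = {330: [[0.1, 0.2, 0.3], [0.3, 0.2, 0.1]]}
templates = build_noise_templates(recordings)

# Hypothetical measured current curve for a shot with that diameter.
cleaned = remove_noise([1.2, 2.2, 3.2], 330, templates)
print(cleaned)
```

Storing one template per diameter keeps the memory cost constant in the number of shots, whereas measuring the noise for each individual shot would roughly double the stored data, as noted above.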
6.3 Requirements
The requirements of this project were all fulfilled, except the last one. This requirement
was considered unfulfilled since the predicted diameter did not follow the true
diameter closely enough for individual anomalies to be detected. This resulted
in some predictions being outside the 8% range. In the BGA job, the LSTM model
seemed able to find deviating trends, but in the RT1 job the model was struggling to
find deviating trends in the quality. That is exemplified by the fourth plateau in Figure
5.2.2a, which in its second part has a deviating trend towards smaller diameters that
the model could not predict. One reason the LSTM model had difficulty
predicting the quality could be the random variations in the diameter. It is possible that
the amplitude of the variations could have increased slightly as a consequence of raising
the nozzle from 650 µm to 800 µm above the surface due to the protruding sensor.
That difference in height will have an impact on the quality of the deposits since the
time of flight between the nozzle and the surface increases, causing the diameter of the
droplets to vary more.
6.4 Research Method
The use of a case study was appropriate for this project since it enabled knowledge to
be obtained regarding what affects the quality of jetting deposits. Using two job types,
BGA and RT1, meant that a clearer evaluation of the neural network could be done.
That also elevated the internal validity in terms of reliable results. It is exemplified
by the performance of the neural network with only the temperature as input as seen
in Appendix C. The results from the BGA job indicate capabilities of finding the mean
value of the job sequence, but the results from the RT1 job discard this hypothesis
since the job is composed of five unique sequences in terms of diameter. Thus, any
distinct differences in the diameter in the job sequence will cause the temperature data
to be insufficient for making predictions. The robustness could also be tested when the
diameter varied in the RT1 job. As mentioned above in Section 6.2, the internal validity
of the project was impacted by noisy measurements. Measures were taken to reduce
the effects of the noise as much as possible, but as mentioned in Chapter 8, it can be
looked into further in future work. The external validity could also be improved in
future projects, and this is also discussed in Chapter 8.
Chapter 7
Conclusions
The purpose of this project, as reflected in the research question, was to evaluate the
usefulness of three temperature sensors’ positions in regard to increasing the accuracy
of a neural network used for predicting jetted solder paste quality. As mentioned in
Chapter 1, the research question is:
In a piezo-based material depositing device, what are the implications
of the predetermined temperature sensor positions, when providing
supporting data from a current sensor, in regard to increasing the
accuracy of predicting jetted solder paste quality by training a neural
network?
The case study performed in this project has shown that none of the temperature
sensors significantly improved the performance of the neural network, and there were
no considerable differences between the three sensors. However, the temperature data
seemed to be able to help the neural network recognize slower trends of the diameter
that lasted over several shots, even though it did not have a large impact on the
accuracy, while the current was more useful for individual shot prediction. The same
behaviour was noticed for both types of jobs used. This summarizes the hypothesis
which has been created in this project, while simultaneously answering the research
question.
All the requirements of the project were fulfilled except for the last one which
concerned the accuracy of the predictions. The main reasons for this are thought to
be that the data used did not contain enough information to make predictions at that
level of accuracy, and also that noise affected the quality of the measurements. The
degree of randomness of the diameter was also higher than expected, making the task
more difficult. Possible future work is discussed in Chapter 8 below, and includes
investigating other types of sensor data, reducing the amount of noise and investigating
other measurements of quality. The project has been conducted with future use in
mind, and the neural network can be adapted to be used on different machines and
with different input and output data. With some further improvements, the neural
network could be utilized for making accurate real-time predictions, which would be a
benefit in the jet printing process.
Chapter 8
Future Work
This master’s thesis explored the possibility of collecting two different types of sensor
data from components of the MY700 jet printing machine and using a neural network
to process that data in order to make predictions about the quality of the deposits. The
project was the first to combine these two elements and examine their usefulness
in this area of engineering at Mycronic. It has been concluded that there are two
alternative ways to proceed with this project in the future, either more research on
improving the neural network or investigating other variables to be measured that
could have more impact on the quality. There are also interesting possibilities for
future work in the long term, once the neural network has been improved. This
includes looking into creating an interface which in real time informs the user of the
predicted quality, and also creating a control loop which adjusts the jetting parameters
in real time.
Further improvements of the software should include evaluating different types of
neural network architectures on existing data. By comparing different architectures,
one can more easily make conclusions regarding the importance of the sensor data
for quality predictions, as well as the most suitable type of neural network. In this
direction of future improvements, investigating other measurements of quality, that is,
shape of deposits, number of satellites, etc., could be of interest. Since the mounting
of Sensor 1, see Figure 1.2.1, required raising the nozzle about 150 µm, a consequence
was worsened deposit quality. Thus, any future evaluation of different neural networks
should consider redesigning the mounting of Sensor 1 so that the nozzle does not need
to be raised.
The other option for improving this project would be more focused on the hardware
configuration. This includes performing tests on different machines and ejectors,
which would improve the external validity, but also investigating other types of sensor
data. However, evaluating on different machines and ejectors should be the secondary
option, while investigating other sensor data should be the primary option, since it is
desired to first improve the performance of the neural network. One suggestion for
sensor data that would be of interest is the actual voltage level to the piezo as seen in
Figure 2.1.2. That figure shows the desired voltage at the piezo from the parameter
settings and currently there is no feedback of the actual voltage level. Since the voltage
level is noisy, some effort would also have to be made to effectively filter the signal. It
would also be of interest to look into different ways of shielding the thermocouples
from EMI from the machine. This would reduce the noise and probably make the
temperature data easier to correlate to the quality of the deposits, while also improving
the internal validity.
References
[1] N. Coenen. Industry trends are boosting Jet Printing. Mycronic AB. 2015. URL:
https://www.smta.org/chapters/files/SMTA-Capital-Chapter-2015-Industry-trends-boosting-Jet-Printing.pdf (visited on 11/18/2019).
[2] Mycronic AB. URL: https://www.mycronic.com/en/about-mycronic/ (visited
on 01/20/2020).
[3] E. Kolibacz. “Classification of incorrectly picked components
using Convolutional Neural Networks”. Master’s Thesis. KTH Royal Institute
of Technology, 2018.
[4] B. Björnsdóttir. “Feedback strategies to decrease droplet variability in
drop-on-demand deposition of complex fluids”. Master’s Thesis. KTH Royal Institute of
Technology, 2019.
[5] M. D. Baker, C. D. Himmel, and G. S. May. “Time series modeling of reactive
ion etching using neural networks”. In: IEEE Transactions on Semiconductor
Manufacturing 8.1 (Feb. 1995), pp. 62–71. DOI: 10.1109/66.350758.
[6] B. Zhang and G. S. May. “Towards real time fault identification in plasma
etching using neural networks”. In: IEEE/SEMI 1998 IEEE/SEMI Advanced
Semiconductor Manufacturing Conference and Workshop. Sept. 1998, pp. 61–65.
DOI: 10.1109/ASMC.1998.731394.
[7] C. J. Spanos, H. F. Guo, A. Miller, and J. Levine-Parrill. “Real-time statistical
process control using tool data (semiconductor manufacturing)”. In: IEEE
Transactions on Semiconductor Manufacturing 5.4 (Nov. 1992), pp. 308–318.
DOI: 10.1109/66.175363.
[8] S. Mallik, M. Schmidt, R. Bauer, and N. N. Ekere. “Influence of solder paste
components on rheological behaviour”. In: 2008 2nd Electronics System-
Integration Technology Conference. Sept. 2008, pp. 1135–1140. DOI: 10.1109/ESTC.2008.4684512.
[9] A. E. Marks, S. Mallik, N. N. Ekere, and A. Seman. “Effect of temperature on
slumping behaviour of lead-free solder paste and its rheological simulation”. In:
2008 2nd Electronics System-Integration Technology Conference. Sept. 2008,
pp. 829–832. DOI: 10.1109/ESTC.2008.4684459.
[10] J. Leal, G. Mårtensson, and N. Augustis. Solder Paste Jetting: An Integral
Approach. Mycronic AB. Nov. 2018. URL: http://smt.iconnect007.com/index.php/article/113955/solder-paste-jetting-an-integral-approach/113958/?skin=smt (visited on 01/16/2020).
[11] S. X. Fu. “Finding Optimal Jetting Waveform Parameters with Bayesian
Optimization”. Master’s Thesis. KTH Royal Institute of Technology, 2018.
[12] APC International Ltd. Piezo Theory. 2016. URL: https://www.americanpiezo.com/knowledge-center/piezo-theory.html (visited on
01/17/2020).
[13] H. C. Liaw, B. Shirinzadeh, and J. Smith. “Sliding-Mode Enhanced Adaptive
Motion Tracking Control of Piezoelectric Actuation Systems for Micro/Nano
Manipulation”. In: IEEE Transactions on Control Systems Technology 16.4
(July 2008), pp. 826–833. ISSN: 2374-0159. DOI: 10.1109/TCST.2007.916301.
[14] Y. Ham, B. An, M. A. Trimzi, G. Lee, J. Park, and S. Yun. “An experimental study
on the displacement amplification mechanism driven by piezoelectric actuators
for jet dispenser”. In: 2016 International Conference on Manipulation,
Automation and Robotics at Small Scales (MARSS). July 2016, pp. 1–5. DOI:
10.1109/MARSS.2016.7561742.
[15] D. Collins. FAQ: What are stacked piezo actuators and what do they do? Nov.
2015. URL: https://www.motioncontroltips.com/faq-what-are-stacked-piezo-actuators-and-what-do-they-do/ (visited on 01/18/2020).
[16] J. Park and W. Moon. “Hysteresis compensation of piezoelectric actuators: The
modified Rayleigh model”. In: Ultrasonics 50.3 (2010), pp. 335–339. ISSN:
0041-624X. DOI: https://doi.org/10.1016/j.ultras.2009.10.012. URL: http://www.sciencedirect.com/science/article/pii/S0041624X09001498.
[17] APC International Ltd. Stripe Actuators. 2016. URL: https://www.americanpiezo.com/standard-products/stripe-actuators.html (visited
on 01/18/2020).
[18] J. Vinnars and J. Vinnars. “Correlations Between Rheological Properties and
Jetting Results in Solder Paste Jetting”. Master’s Thesis. Uppsala Universitet,
June 2017.
[19] D. E. Alexander. “Chapter 4 - Biological Materials Blur Boundaries”. In: Nature’s
Machines. Academic Press, 2017, pp. 111–114. ISBN: 978-0-12-804404-9.
[20] R. P. Chhabra and J. F. Richardson. “Chapter 1 - Non-Newtonian Fluid
Behaviour”. In: Non-Newtonian Flow and Applied Rheology (Second Edition).
Second Edition. Oxford: Butterworth-Heinemann, 2008, pp. 1–55. ISBN: 978-
0-7506-8532-0.
[21] N. Chandran, S. Chandran, and S. Thomas. “Chapter 1 - Introduction to
rheology”. In: Rheology of Polymer Blends and Nanocomposites. Micro and
Nano Technologies. Elsevier, 2020, pp. 1–17. ISBN: 978-0-12-816957-5.
[22] M. Judd and K. Brindley. “6 - Solder paste”. In: Soldering in Electronics
Assembly (Second Edition). Second Edition. Oxford: Newnes, 1999, pp. 109–
126. ISBN: 978-0-7506-3545-5.
[23] M. M. Schwartz. Soldering: Understanding the Basics. ASM International,
2014. ISBN: 9781627080583.
[24] E. Landman. “Viscosity control of solder paste by ultrasound actuation”.
Master’s Thesis. KTH Royal Institute of Technology, 2018.
[25] D. Ibrahim. “Chapter 3 - Thermocouple Temperature Sensors”. In:
Microcontroller Based Temperature Monitoring and Control. Oxford: Newnes,
2002, pp. 63–85. ISBN: 978-0-7506-5556-9.
[26] P. R. N. Childs. “5 - Thermocouples”. In: Practical Temperature Measurement.
Oxford: Butterworth-Heinemann, 2001, pp. 98–144. ISBN: 978-0-7506-5080-
9.
[27] National Instruments. Current Measurements: How-To Guide. Oct. 2019. URL:
http://www.ni.com/tutorial/7114/en/ (visited on 04/28/2020).
[28] N. Patin. “1 - Sensors for Power Electronics”. In: Power Electronics Applied
to Industrial Systems and Transports. Elsevier, 2016, pp. 1–73. ISBN: 978-1-
78548-033-1.
[29] M. Ossmann. Red Pitaya Not just a USB scope module. Nov. 2014. URL: https://www.elektormagazine.com/assets/upload/files/EN2014120381.pdf (visited on 01/19/2020).
[30] H. Baggen. Review: The new Red Pitaya line. Nov. 2014. URL: https://www.elektormagazine.com/news/review-the-new-red-pitaya-line (visited on
01/19/2020).
[31] Digi-Key Electronics. Red Pitaya STEMlab Device. July 2018. URL: https://www.digikey.ro/en/ptm/t/trenz/red-pitaya-stemlab-device/tutorial (visited on 01/19/2020).
[32] R. Singh. FPGA Vs ASIC: Differences Between Them And Which One To Use?
July 2018. URL: https://numato.com/blog/differences-between-fpga-and-asics/ (visited on 01/19/2020).
[33] D. R. Baughman and Y. A. Liu. “1 - Introduction to Neural Networks”. In: Neural
Networks in Bioprocessing and Chemical Engineering. Boston: Academic
Press, 1995, pp. 1–20. ISBN: 978-0-12-083030-5. DOI: https://doi.org/10.1016/B978-0-12-083030-5.50007-2. URL: http://www.sciencedirect.com/science/article/pii/B9780120830305500072.
[34] X. Yang. “8 - Neural networks and deep learning”. In: Introduction to
Algorithms for Data Mining and Machine Learning. Academic Press, 2019,
pp. 139–161. ISBN: 978-0-12-817216-2. DOI: https://doi.org/10.1016/B978-0-12-817216-2.00015-6. URL: http://www.sciencedirect.com/science/article/pii/B9780128172162000156.
[35] I. Goodfellow, Y. Bengio, and A. Courville. Deep Learning. http://www.deeplearningbook.org. MIT Press, 2016.
[36] N. Donges. Recurrent neural networks 101: Understanding the basics of RNNs
and LSTM. June 2019. URL: https://builtin.com/data-science/recurrent-neural-networks-and-lstm (visited on 01/23/2020).
[37] J. McGonagle, C. Williams, and J. Khim. Recurrent Neural Network. URL:
https://brilliant.org/wiki/recurrent-neural-network/ (visited on
01/23/2020).
[38] C. Olah. Understanding LSTM Networks. Aug. 2015. URL: http://colah.github.io/posts/2015-08-Understanding-LSTMs/ (visited on 01/26/2020).
[39] The MathWorks Inc. Long Short-Term Memory Networks. URL: https://se.mathworks.com/help/deeplearning/ug/long-short-term-memory-networks.html;jsessionid=2913b326abc1143e5efc7917a044 (visited on 01/26/2020).
[40] J. Brownlee. Long short-term memory networks with Python: develop
sequence prediction models with deep learning. v1.0. 2017.
[41] S. Rajasekar, P. Philominathan, and V. Chinnathambi. Research Methodology.
2006. arXiv: physics/0601009 [physics.gen-ph]. URL: https://arxiv.org/pdf/physics/0601009.pdf (visited on 04/24/2020).
[42] A. Håkansson. “Portal of Research Methods and Methodologies for Research
Projects and Degree Projects”. In: Proceedings of the International Conference
on Frontiers in Education : Computer Science and Computer Engineering
FECS’13. CSREA Press U.S.A, 2013, pp. 67–73.
[43] D. Ary, L. Cheser Jacobs, C. Sorensen, and A. Razavieh. Introduction to
Research in Education. 8th edition. Wadsworth, Cengage Learning, 2010.
[44] D. Muijs. Doing quantitative research in education with SPSS. Sage, 2010.
[45] D. E. Perry. Case Studies. 2004. URL: http://users.ece.utexas.edu/~perry/education/382c/L06.pdf (visited on 04/24/2020).
[46] M. Shuttleworth. Case Study Research Design. Apr. 2008. URL: https://explorable.com/case-study-research-design (visited on 01/28/2020).
[47] The Open University. Case Studies and Experiments. pp. 63-70. 2013. URL:
https://www.open.edu/openlearncreate/pluginfile.php/50733/mod_oucontent/oucontent/550/none/none/deh313_1blk2.12.pdf? (visited on
04/25/2020).
[48] R. B. Johnson. “Examining the validity structure of qualitative research”. In:
Education 118.2 (1997), p. 282.
[49] D. T. Campbell and J. C. Stanley. “Experimental and Quasi-Experimental
Designs for Research”. In: Handbook of Research on Teaching. Houghton
Mifflin Company, 1963, pp. 5–22. ISBN: 0-395-30787-2.
[50] Physitemp - Precision Temperature Specialists. 2019. URL: https://physitemp.com/ (visited on 05/27/2020).
[51] ITS-90 Table for type T Thermocouple (Ref Junction 0°C). Reotemp
Instruments. Nov. 2014. URL: https://www.thermocoupleinfo.com/pdf/type-t-thermocouple-reference-table.pdf (visited on 02/13/2020).
[52] Analog Devices Inc. AD594/AD595. Monolithic Thermocouple Amplifiers with
Cold Junction Compensation. URL: https://www.sparkfun.com/datasheets/IC/AD595.pdf (visited on 02/12/2020).
[53] J. Brownlee. Time Series Prediction with LSTM Recurrent Neural Networks
in Python with Keras. Aug. 2019. URL: https://machinelearningmastery.com/time-series-prediction-lstm-recurrent-neural-networks-python-keras/.
Appendix A
PCB Schematic Overview
A schematic overview of the PCB can be seen in Figure A.0.1 below.
Figure A.0.1: PCB Schematic Overview.
Appendix B
Training and Validation Loss
Training and validation loss of the LSTM model while using the BGA job is shown in
Figure B.0.1. The input parameter configurations are: (T1, T2, T3), (I, T1), (I, T2) and
(I, T3).
Figure B.0.1: Training and validation loss for different input parameter configurations using a BGA job. (a) Input parameters: T1, T2, T3. (b) Input parameters: I, T1. (c) Input parameters: I, T2. (d) Input parameters: I, T3.
Appendix C
Performance Without Current
Table C.0.1 contains the results from training the neural network using the temperature
data from the three sensors without the piezo current data. The results while using
both the BGA and RT1 job are shown. Three runs were performed for each combination
using the test data, and the average values from the runs are presented in the table. The
metrics used for evaluating the diameter prediction of the LSTM model are MAE, MRE
and also the number of predictions where the relative error of the predicted diameter
was greater than 8% (Bad predictions).
Table C.0.1: Test results when only using temperature data.
                     Job Type
                     BGA       RT1
MAE [µm]             5.77      57.24
MRE [%]              1.51      14.07
Bad predictions      197       32,870
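The three metrics in the table can be computed as in the following sketch. It assumes that MRE is the mean of the absolute error divided by the true diameter, expressed in percent, which is an assumption about the definition; the function name and the example diameters are hypothetical.

```python
def evaluate(true_diameters, predicted_diameters, threshold=0.08):
    """Return (MAE, MRE in percent, number of bad predictions).

    A prediction counts as bad when its relative error exceeds `threshold`
    (8% by default, matching the range used in this project).
    """
    abs_err = [abs(p - t)
               for t, p in zip(true_diameters, predicted_diameters)]
    rel_err = [e / t for e, t in zip(abs_err, true_diameters)]
    n = len(abs_err)
    mae = sum(abs_err) / n
    mre = 100.0 * sum(rel_err) / n
    bad = sum(1 for r in rel_err if r > threshold)
    return mae, mre, bad


# Hypothetical diameters in micrometres, for illustration only.
true_d = [400, 400, 400, 400]
pred_d = [404, 396, 440, 400]
mae, mre, bad = evaluate(true_d, pred_d)
print(mae, mre, bad)
```

In this example only the third prediction, with a relative error of 10%, falls outside the 8% range and is counted as a bad prediction.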
Performance of the LSTM model using the BGA job when only given the
temperature data as input is shown in Figure C.0.1, where Figure C.0.1a shows the
predicted and true diameter and Figure C.0.1b shows the predicted and true diameter
distribution.
Figure C.0.1: Results from the predictions by the LSTM model when only given the temperature data as input when using a BGA job. (a) The predicted and true diameter. (b) Distribution of the predicted and true diameters.
The corresponding graphs for the RT1 job are shown in Figure C.0.2, where Figure
C.0.2a shows the predicted and true diameter and Figure C.0.2b shows the predicted
and true diameter distribution.
Figure C.0.2: Results from the predictions by the LSTM model when only given the temperature data as input when using an RT1 job. (a) The predicted and true diameter. (b) Distribution of the predicted and true diameters.