Uladzimir Kalashnikau
Performance challenges and
victories we got at open source
ecommerce
How EPAM was engaged by Magento
History:
Magento was looking to outsource some product development work to extend
internal team capacity. EPAM’s “Developer’s Developer” reputation, and our
experience in product development for other eCommerce platforms helped
Magento to recognize us as partners.
Benefits for Magento:
• Outsource part of functionality to extend internal team capacity
• Deliver more new features with Merchant Beta and GA releases
Benefits for EPAM:
• Improve our knowledge of Magento 2.0
• Align development approaches and best practices with Magento core team
Goals of project
• Improve import-export functionalities for products/customers
• Implement new functionality to import/export prices
• Change obsolete file format for import/export purposes
• Optimize import/export performance
• Improve error processing for import/export operations
• All functionality should be covered by tests and conform to Magento
coding standards
Acceptance criteria
The import procedure should be a linear process for the Magento framework: the number of records in a single file should not increase processing time exponentially until the bottleneck is the MySQL server itself.
Run #1: 100k products, 30 min
• simple_products = 60000
• configurable_products = 20000 (each configurable has 3 simple products as options)
• bundle_products = 10000 (each bundle product has 3 simple products as options)
• grouped_products = 10000 (each grouped product has 3 simple products as options)
• categories = 1000
• categories_nesting_level = 3
• Each product has 2 images attached using local storage only
• Number of product attribute sets = 100
• Number of attributes per product = 10
• Total number of attributes = 1000
Run #2: 200k products, 1 hour
• simple_products = 120000
• configurable_products = 40000 (each configurable has 3 simple products as options)
• bundle_products = 20000 (each bundle product has 3 simple products as options)
• grouped_products = 20000 (each grouped product has 3 simple products as options)
• categories = 1000
• categories_nesting_level = 3
• Each product has 2 images attached using local storage only
• Number of product attribute sets (product templates) = 100
• Number of attributes per product = 10
• Total number of attributes = 1000
The import process shouldn’t slow frontend page load by more than 20% of the average
page load time, as measured with JMeter
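The two acceptance criteria reduce to simple arithmetic. The sketch below is only an illustration: the function names are hypothetical, and the page-load numbers in the usage lines are made up rather than measured.

```php
<?php
/** Minutes allowed for $records if time scales linearly from the 100k-in-30-minutes run. */
function linearBudgetMinutes(int $records): float
{
    return $records / 100000 * 30.0;
}

/** True if page load during import stays within the allowed slowdown (20% by default). */
function frontendWithinLimit(float $baselineMs, float $duringImportMs, float $maxSlowdown = 0.20): bool
{
    return $duringImportMs <= $baselineMs * (1 + $maxSlowdown);
}

// Run #2 doubles the records of Run #1, so a linear import gets double the time.
$budgetFor200k = linearBudgetMinutes(200000);

// Made-up page-load times in milliseconds: 500 ms baseline, 590 ms during import.
$acceptable = frontendWithinLimit(500.0, 590.0);
```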
System configuration
Why it’s not so simple?
Product
Media images
Categories
Links to other products
Taxes
Custom options
Custom attributes
Complex products attributes
• Product is the key entity in eCommerce
• The DB uses the EAV model for data storage
• A product has many linked entities
Product types
Simple
Virtual
Configurable
Bundle
Grouped
Gift cards (EE only)
Import file sample
COLUMNS
sku,website_code,store_view_code,attribute_set_code,product_type,name,description,short_description,weight,product_online,visibility,product_websites,categories,price,special_price,special_price_from_date,special_price_to_date,tax_class_name,url_key,meta_title,meta_keywords,meta_description,base_image,base_image_label,small_image,small_image_label,thumbnail_image,thumbnail_image_label,additional_images,additional_image_labels,configurable_variation_prices,configurable_variation_labels,configurable_variations,bundle_price_type,bundle_price_view,bundle_sku_type,bundle_weight_type,bundle_values,downloadble_samples,downloadble_links,associated_skus,related_skus,crosssell_skus,upsell_skus,custom_options,additional_attributes,manage_stock,is_in_stock,qty,out_of_stock_qty,is_qty_decimal,allow_backorders,min_cart_qty,max_cart_qty,notify_on_stock_below,qty_increments,enable_qty_increments,is_decimal_divided,new_from_date,new_to_date,gift_message_available,giftcard_type,giftcard_amount,giftcard_allow_open_amount,giftcard_open_amount_min,giftcard_open_amount_max,giftcard_lifetime,giftcard_allow_message,giftcard_email_template,created_at,updated_at,custom_design,custom_design_from,custom_design_to,custom_layout_update,page_layout,product_options_container,msrp_price,msrp_display_actual_price_type,map_enabled
PRODUCT DATA
simplesku00,,,Default,simple,"simple Product 00","simple Product 00 Description","simple Product 00 Short Description",33.14,1,"catalog, search",base,Section3/S3Category4/SubCategory10|Section9/S9Category2/SubCategory1,3193.50,89.9900,02-03-15,02-03-15,"Taxable Goods",simple00urlkey,"simple Product 00 Meta Title","simple, product","simple Product 00 Meta Description",/mediaimport/image1.png,"Base Image Label",/mediaimport/image2.png,"Small Image Label",/mediaimport/image3.png,"Thumbnail Image Label","/mediaimport/image4.png, /mediaimport/image5.png","Label 1, Label 1a",,,,,,,,,,,,simplesku0,,simplesku0,,"set9_attribute1_code = value8,set9_attribute2_code = value6,set9_attribute3_code = value2,set9_attribute4_code = value1,set9_attribute5_code = value4,set9_attribute6_code = value8,set9_attribute7_code = value7,set9_attribute8_code = value6,set9_attribute9_code = value2,set9_attribute10_code = value1,size=0",1,1,1000,2,0,1,1,1000,1,0,0,0,02-03-15,02-03-15,0,,,,,,,,,02-03-15,02-03-15,"Magento Blank",02-03-15,02-04-15,,"3 columns","Product Info Column",9,"On Gesture",1
One of the concepts for import optimization
Standard saving process: Append data to model → Prepare data for insert → Query to DB
Multi-insert process: Get imported data → Retrieve data ready to insert → Create multi-insert query
How it actually works
Standard saving process: Append data to model → Prepare data for insert → Query to DB
Multi-insert process: Get imported data → Prepare data for insert → Create multi-insert query
Bunch import idea
― Sort products from simple to complex
― Divide the full pack into bunches of 50 products each
― Import a full bunch of products in one query
― Retrieve IDs of inserted/updated products
― Import connected entities one by one
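A minimal sketch of the bunch idea, assuming plain arrays as input. The table name `product_entity`, the `TYPE_ORDER` map and both helper functions are hypothetical names used for illustration, not the Magento import code.

```php
<?php
// Hypothetical ordering of product types from simple to complex.
const TYPE_ORDER = ['simple' => 0, 'virtual' => 1, 'configurable' => 2, 'bundle' => 3, 'grouped' => 4];

/** Sort rows so that simple products come before the complex ones that reference them. */
function sortSimpleToComplex(array $rows): array
{
    usort($rows, function (array $a, array $b): int {
        return (TYPE_ORDER[$a['product_type']] ?? 99) <=> (TYPE_ORDER[$b['product_type']] ?? 99);
    });
    return $rows;
}

/** Build one multi-row INSERT with placeholders for a whole bunch. */
function buildMultiInsert(string $table, array $columns, array $bunch): array
{
    $rowPlaceholder = '(' . implode(', ', array_fill(0, count($columns), '?')) . ')';
    $sql = sprintf(
        'INSERT INTO %s (%s) VALUES %s',
        $table,
        implode(', ', $columns),
        implode(', ', array_fill(0, count($bunch), $rowPlaceholder))
    );
    $params = [];
    foreach ($bunch as $row) {
        foreach ($columns as $column) {
            $params[] = $row[$column] ?? null;
        }
    }
    return [$sql, $params];
}

$rows = [
    ['sku' => 'bundle-1', 'product_type' => 'bundle'],
    ['sku' => 'simple-1', 'product_type' => 'simple'],
    ['sku' => 'simple-2', 'product_type' => 'simple'],
];

// Bunches of 50 products, each written with a single query.
$bunches = array_chunk(sortSimpleToComplex($rows), 50);
[$sql, $params] = buildMultiInsert('product_entity', ['sku', 'product_type'], $bunches[0]);
```

The resulting `$sql` and `$params` could be passed to `PDO::prepare()` and `PDOStatement::execute()`, so each bunch costs one round trip to the database instead of one per product.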
Before optimizations took place
• Importing 500k products on a cluster: about 4-5h
• Creating URL rewrites for them: about 12h
• Total time: about 17h
• Needs to be less than 2.5h
T - Technology
DESCRIPTION
XHProf is a function-level hierarchical profiler for PHP with a simple HTML-based
navigational interface. The raw data collection component is implemented in C (as a PHP
extension). The reporting/UI layer is all in PHP. It can report function-level
inclusive and exclusive wall times, memory usage, CPU times and number of calls for each
function. Additionally, it can compare two runs (hierarchical DIFF reports) or
aggregate results from multiple runs.
MAIN ABILITIES
• More lightweight and faster than Xdebug
• Hierarchical reports showing memory and CPU usage
• Ability to create a call-graph image based on a report
• Ability to create a summary report aggregated from several runs
How to implement XHProf
<?php
// Initialize XHProf and collect CPU and memory metrics in addition to wall time
xhprof_enable(XHPROF_FLAGS_CPU | XHPROF_FLAGS_MEMORY);
// Run our code (run() stands for the code being profiled)
run();
// Stop the profiler and retrieve profiling data
$xhprof_data = xhprof_disable();
// Generate report
include_once "/var/www/xhprof-0.9.4/xhprof_lib/utils/xhprof_lib.php";
include_once "/var/www/xhprof-0.9.4/xhprof_lib/utils/xhprof_runs.php";
$xhprof_runs = new XHProfRuns_Default();
// Save the run under the "test" source; the returned id identifies it in the XHProf UI
$run_id = $xhprof_runs->save_run($xhprof_data, "test");
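The array returned by `xhprof_disable()` can also be inspected directly, without the HTML UI. The sketch below sums call counts and inclusive wall time per function. The helper name `inclusiveTimes` is hypothetical, and the sample array is hand-written in the shape of XHProf output with made-up numbers.

```php
<?php
/** Sum call counts and inclusive wall time per function from raw XHProf data. */
function inclusiveTimes(array $xhprofData): array
{
    $totals = [];
    foreach ($xhprofData as $edge => $metrics) {
        // Keys look like "parent==>child"; the root entry is just "main()".
        $parts = explode('==>', $edge);
        $function = end($parts);
        $totals[$function]['ct'] = ($totals[$function]['ct'] ?? 0) + $metrics['ct'];
        $totals[$function]['wt'] = ($totals[$function]['wt'] ?? 0) + $metrics['wt'];
    }
    // Slowest functions first.
    uasort($totals, function (array $a, array $b): int {
        return $b['wt'] <=> $a['wt'];
    });
    return $totals;
}

// Hand-written sample; 'ct' is the call count, 'wt' the wall time in microseconds.
$sample = [
    'main()'               => ['ct' => 1,  'wt' => 1000],
    'main()==>import'      => ['ct' => 1,  'wt' => 900],
    'import==>saveProduct' => ['ct' => 50, 'wt' => 600],
    'main()==>saveProduct' => ['ct' => 2,  'wt' => 50],
];
$totals = inclusiveTimes($sample);
```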
What the reports look like
Call-graph visualization
Very bad
Not so bad
Seems normal, but…
What it looks like in Magento
Bottlenecks, classification
• Static (one-time):
– Mostly affect small imports
– Hard to find on a large pack of imported products
• Linear:
– Hard to detect on small imports because static bottlenecks dominate
– Take almost the same percentage of time on medium and big packs
• Exponential:
– Hard to find on small/medium import packs
– Can be detected on a big pack of products
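One way to apply this classification is to profile the same function on two import files of different size and compare how its time grows. This is a rough sketch: the function name and the 25% tolerance are arbitrary assumptions, and "exponential" here simply means "grows faster than the number of rows", which is how the slides use the word.

```php
<?php
/**
 * Classify how a function's time grows between a small and a large import run.
 * $tolerance is an arbitrary threshold chosen for illustration.
 */
function classifyBottleneck(
    float $timeSmall,
    float $timeLarge,
    int $rowsSmall,
    int $rowsLarge,
    float $tolerance = 0.25
): string {
    $sizeRatio = $rowsLarge / $rowsSmall;
    $timeRatio = $timeLarge / $timeSmall;

    if ($timeRatio <= 1 + $tolerance) {
        return 'static';        // roughly the same cost regardless of import size
    }
    if ($timeRatio <= $sizeRatio * (1 + $tolerance)) {
        return 'linear';        // grows at most in proportion to the row count
    }
    return 'exponential';       // grows faster than the row count
}

// Made-up timings in seconds for runs of 1,000 and 10,000 rows.
$initAttributes = classifyBottleneck(2.0, 2.1, 1000, 10000);
$saveProducts   = classifyBottleneck(1.0, 10.0, 1000, 10000);
$urlRewrites    = classifyBottleneck(1.0, 100.0, 1000, 10000);
```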
Approaches to optimization
Implement multi-processing
― Generate queue
― Create a number of workers
― Pray that it won’t affect frontend loading time
Pros:
• We could use several processor cores to speed up data processing
Cons:
• Trouble with thread/system functions that are disabled for security reasons
• Potential risk to the frontend loading time tests
• Quite a complex mechanism to implement
• Potential risk of row/table lock delays due to parallel reads and writes on a single DB
Find and fix bottlenecks
― Change attribute load process
― Change URL Rewrites save process
― Implement effective plugin cache
― Other small optimizations
Pros:
• We could deliver in iterations
• Less hacky code
Cons:
• We don’t know in advance how much of a performance increase
such fixes can deliver
• These changes could affect tests and core processes
Difficulties in optimization
• Time or quality: which should we prefer on really dirty code?
• Import/export functionality is a part of MTF (Magento Testing Framework), so changing it
breaks tests
• Results are affected by the size of the import file
• Results vary with different DB data, and we didn’t have a reference DB
• It takes a long time to get a report
• To detect exponential bottlenecks we have to compare reports for different import files
• How do we import related entities if we don’t have a unique key?
• Memory usage vs. queries to DB
• How do we compare an elephant and a fly if we don’t know the real server configuration?
• XHProf lies: we cannot be sure of its results and should use it only as a guideline
It’s a lie!
Interceptors idea
Main class: Method 1, Method 2, Method 3
Interceptor-covered class extends Main class: Method 1, Method 2, Method 3
Method 1 in the interceptor-covered class:
― Before plugin call
― Around plugin call
― After plugin call
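The interceptor idea can be sketched in a few lines. This is a simplified, hand-written illustration with hypothetical class names, not the code Magento generates: the interceptor extends the main class and wraps Method 1 with before, around and after plugin calls.

```php
<?php
class MainClass
{
    public function method1(string $value): string
    {
        return "main($value)";
    }
}

// Hypothetical plugin with before/around/after hooks for method1.
class SamplePlugin
{
    public function beforeMethod1(MainClass $subject, string $value): array
    {
        return [strtoupper($value)];            // may change the arguments
    }

    public function aroundMethod1(MainClass $subject, callable $proceed, string $value): string
    {
        return '[' . $proceed($value) . ']';    // wraps the original call
    }

    public function afterMethod1(MainClass $subject, string $result): string
    {
        return $result . '!';                   // may change the result
    }
}

// The interceptor extends the main class and routes method1 through the plugin.
class MainClassInterceptor extends MainClass
{
    private $plugin;

    public function __construct(SamplePlugin $plugin)
    {
        $this->plugin = $plugin;
    }

    public function method1(string $value): string
    {
        [$value] = $this->plugin->beforeMethod1($this, $value);
        $proceed = function (string $v): string {
            return parent::method1($v);
        };
        $result = $this->plugin->aroundMethod1($this, $proceed, $value);
        return $this->plugin->afterMethod1($this, $result);
    }
}

$object = new MainClassInterceptor(new SamplePlugin());
$output = $object->method1('sku');
```

Every intercepted call pays for the extra plugin lookups and calls, which is why the overhead becomes visible when a method is invoked for each imported row.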
Interceptors benchmark results
Before optimizations take place
Plugin system performance issue
Interceptors benchmark results
After plugin system optimization
Instead of old calls
Example of optimizing static bottleneck
Before:
― Start init
― Load list of product types
― Any product type left to init? If no, end init
― If yes, get next product type
― Load attribute entities for the product
― Load data for the attribute
― Repeat the check for the next product type
After:
― Start init
― Load list of product types
― Any product type left to init? If no, end init
― If yes, get next product type
― Load attribute IDs by product type
― Is every attribute in the cache?
― If no, load absent attribute entities by ID, load data for the attributes and add the attributes to the cache
― If yes, get the attributes from the cache by ID
― Repeat the check for the next product type
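The "after" flow amounts to a cache that loads only the attributes it has not seen yet. The sketch below assumes a hypothetical `AttributeCache` class and a loader callback standing in for the DB query; the attribute IDs per product type are made up.

```php
<?php
class AttributeCache
{
    private $cache = [];   // attribute id => attribute data
    private $loader;       // callable(int[] $ids): array, stands in for the DB query

    public function __construct(callable $loader)
    {
        $this->loader = $loader;
    }

    /** Return attributes for the given ids, loading only those not cached yet. */
    public function getByIds(array $ids): array
    {
        $absent = array_values(array_diff($ids, array_keys($this->cache)));
        if ($absent) {
            foreach (($this->loader)($absent) as $id => $data) {
                $this->cache[$id] = $data;
            }
        }
        return array_intersect_key($this->cache, array_flip($ids));
    }
}

// Hypothetical loader that records which ids were actually requested from the "DB".
$loaded = [];
$cache = new AttributeCache(function (array $ids) use (&$loaded): array {
    $loaded = array_merge($loaded, $ids);
    $result = [];
    foreach ($ids as $id) {
        $result[$id] = ['id' => $id, 'code' => "attribute_$id"];
    }
    return $result;
});

// Made-up attribute ids per product type; several attributes are shared between types.
$attributeIdsByType = ['simple' => [1, 2, 3], 'configurable' => [2, 3, 4], 'bundle' => [1, 4]];
foreach ($attributeIdsByType as $type => $ids) {
    $attributes = $cache->getByIds($ids);
}
```

With shared attributes, each attribute is loaded once no matter how many product types use it.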
Cache reusability on URL Rewrite example
Creating categories:
― Start creating category
― Get category from DB
― Category exists? If no, create category
― Any categories left? If yes, repeat; if no, end creating category
Creating URL rewrites:
― Start creating URL rewrites
― Get category from DB
― Create URL rewrite
― Any categories left? If yes, repeat; if no, end creating URL rewrites
Changes in the optimized version:
― Get all categories and place them in the cache
― Get category from cache instead of from DB, in both flows
― Place newly created categories in the cache
Global URL Rewrite optimization
Before:
― Start creating URL rewrites
― Get IDs of all inserted/updated products
― Products exist? If no, end creating URL rewrites
― If yes, load next product by ID
― Load product attributes
― Load categories for the product
― Load categories attributes
― Generate URLs for the product
― Generate URLs for the categories
― Generate URLs for the websites
― Insert URLs for current product
― Repeat the check for the next product
After:
― Start creating URL rewrites
― Get array of products from bunch
― Products exist? If no, multi-insert URLs from the cache and end creating URL rewrites
― If yes, populate data for one product to object from array
― Get categories from the cache
― Generate URL for the product
― Generate URLs for the categories
― Generate URLs for the websites
― Store URLs in temporary cache
― Repeat the check for the next product
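A rough sketch of the "after" flow: rewrites for a whole bunch are generated from in-memory arrays and collected, so they can be written once at the end. The function name, array keys and path formats are hypothetical and only illustrate the shape of the data.

```php
<?php
/** Generate rewrite rows for one product: one for the product itself plus one per category. */
function generateRewrites(array $product, array $categoryPaths): array
{
    $rows = [[
        'request_path' => $product['url_key'] . '.html',
        'target_path'  => 'catalog/product/view/id/' . $product['id'],
    ]];
    foreach ($product['category_ids'] as $categoryId) {
        $rows[] = [
            'request_path' => $categoryPaths[$categoryId] . '/' . $product['url_key'] . '.html',
            'target_path'  => 'catalog/product/view/id/' . $product['id'] . '/category/' . $categoryId,
        ];
    }
    return $rows;
}

// Category paths are loaded once and reused for every product (made-up data).
$categoryPaths = [3 => 'section3', 4 => 'section3/category4'];

// Products come from the imported bunch as arrays, not from per-product DB loads.
$bunch = [
    ['id' => 1, 'url_key' => 'simple00urlkey', 'category_ids' => [3, 4]],
    ['id' => 2, 'url_key' => 'simple01urlkey', 'category_ids' => []],
];

// Temporary cache of rewrite rows for the whole bunch.
$pending = [];
foreach ($bunch as $product) {
    foreach (generateRewrites($product, $categoryPaths) as $row) {
        $pending[] = $row;
    }
}
// $pending now holds every row for the bunch and can be written with one multi-row INSERT.
```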
• CPU 4 physical cores 3.5GHz (2 for VM)
• L2 cache 1 MB
• L3 cache 6 MB
• RAM 16GB
• SATA3 HDD (64 MB buffer)
How to compare performance?
First config
• CPU 2 physical cores with Hyper-Threading 3.2GHz (2 for VM)
• L2 cache 512 KB
• L3 cache 4 MB
• RAM 8GB
• SATA1 HDD (16 MB buffer)
Total time: ~50m
Total time: ~16.5m
Second config
PROJECT RESULTS
200 000 products
Start: 12:12:16, End: 12:48:05, Total: ~36 minutes
100 000 products
Start: 13:34:49, End: 13:51:08, Total: ~16.5 minutes
Magento 2 Merchant Beta Release
We are tremendously excited to announce that today we reached another significant development
milestone with the release of the Magento 2 Merchant Beta. This release brings us to the last stage
before the general availability (GA) of Magento 2 in Q4 2015.
…
• The Enterprise Edition module includes updates to merchant features like import/export
functionality, configurable swatches, transactional emails and more.
• It demonstrates significant performance improvements for both the Magento Community
Edition and Enterprise Edition with holistic updates to both server-side and client-side architecture.
Server-side updates include out of box Varnish 4, full page caching, and support for HHVM3.6.
Client-side updates include static content caching in browser, image compression, use of jQuery,
and RequireJS for better management of JavaScript and bundling to reduce file download counts.
News link: http://magento.com/blog/technical/magento-2-merchant-beta-release
Our changes went into the release!