
Planet MySQL

Planet MySQL - https://planet.mysql.com

Importing Data from MongoDB to MySQL using Python

2 hours 15 min ago

MySQL Shell 8.0.13 (GA) introduced a new feature to allow you to easily import JSON documents to MySQL. The basics of this new feature were described in a previous blog post. In this blog we will provide more details about this feature, focusing on a practical use case of interest to many: how to import JSON data from MongoDB to MySQL.…

Categories: Web Technologies

MySQL X DevAPI Connection Pool with Connector/Python

7 hours 27 min ago

If you have an application that needs to use multiple connections to the MySQL database for short periods of time, it can be a good idea to use a connection pool to avoid creating a new connection and going through the whole authentication process every time a connection is needed. For the Python Database API (PEP 249), MySQL Connector/Python has had support for connection pools for a long time. With the release of MySQL 8.0.13, the new X DevAPI also has support for connection pools.

This blog will first cover the background of the X DevAPI connection pool feature in MySQL Connector/Python and then provide an example.

Background

You create a connection pool using the mysqlx.get_client() function. You may wonder why you are creating a client and not a pool. As will be shown later, there is a little more to this feature than just a connection pool, so it makes sense to use a more generic term.

The get_client() function takes two arguments: The connection options and the client options. The connection options are the usual arguments defining which MySQL instance to connect to, authentication related options, how to connect, etc. The client options are the interesting ones in the discussion of a connection pool.

The client options are given as a dictionary or a JSON document written as a string. Currently, the only supported client options are the ones defining the connection pool. These are specified under the pooling field (an example will be provided shortly). This leaves room to expand get_client() later with features other than the connection pool.

There are currently four connection pool options:

  • enabled: Whether the connection pool is enabled. The default is True.
  • max_size: The maximum number of connections that can be in the pool. The default is 25.
  • max_idle_time: How long, in milliseconds, a connection can be idle before it is closed. The default is 0 which means “infinite” (in practice 2147483000 milliseconds).
  • queue_timeout: The maximum amount of time in milliseconds that an attempt to get a connection from the pool will block. If no connections have become available before the timeout, a mysqlx.errors.PoolError exception is raised. The default is 0 which means “infinite” (in practice 2147483000 milliseconds).

What happens if you disable the connection pool? In that case the client that is returned simply works as a template for connections, and you can keep creating connections until MySQL Server runs out of connections. The session you end up with is a regular connection, so when you close it, it will disconnect from MySQL.

Back to the case where the connection pool is enabled. Once you have the client object, you can start using the pool. You retrieve a connection from the pool with the get_session() method. No arguments are used. After this you can use the session just as a regular standalone connection. The only difference is that when you close the session, it is returned to the pool rather than disconnected.

Enough background. Let’s see an example.

Example

The following example creates a connection pool with at most two connections. Then two sessions are fetched from the pool and their connection IDs are printed. A third session will be requested before one of the original sessions is returned to the pool. Finally, a session is reused and its connection ID is printed.

import mysqlx
from datetime import datetime

cnxid_sql = "SELECT CONNECTION_ID() AS ConnectionID"
fmt_id = "Connection {0} ID ..........................: {1}"

connect_args = {
    "host": "127.0.0.1",
    "port": 33060,
    "user": "pyuser",
    "password": "Py@pp4Demo",
};

client_options = {
    "pooling": {
        "enabled": True,
        "max_size": 2,
        "max_idle_time": 60000,
        "queue_timeout": 3000,
    }
}

# Create the connection pool
pool = mysqlx.get_client(connect_args, client_options)

# Fetch two connections (exhausting the pool)
# and get the connection ID for each
connection1 = pool.get_session()
id1_row = connection1.sql(cnxid_sql).execute().fetch_one()
print(fmt_id.format(1, id1_row["ConnectionID"]))

connection2 = pool.get_session()
id2_row = connection2.sql(cnxid_sql).execute().fetch_one()
print(fmt_id.format(2, id2_row["ConnectionID"]))

# Attempt to get a third connection
time = datetime.now().strftime('%H:%M:%S')
print("Starting to request connection 3 .........: {0}".format(time))
try:
    connection3 = pool.get_session()
except mysqlx.errors.PoolError as err:
    print("Unable to fetch connection 3 .............: {0}".format(err))
time = datetime.now().strftime('%H:%M:%S')
print("Request for connection 3 completed .......: {0}".format(time))

# Return connection 1 to the pool
connection1.close()

# Try to get connection 3 again
connection3 = pool.get_session()
id3_row = connection3.sql(cnxid_sql).execute().fetch_one()
print(fmt_id.format(3, id3_row["ConnectionID"]))

# Close all connections
pool.close()

The first thing to notice is the client options defined in lines 14-21. In this case all four options are set, but you only need to set those where you do not want the default value. The settings allow for at most two connections in the pool, a request for a session is allowed to block for at most 3 seconds, and idle sessions should be disconnected after 60 seconds.

In line 24 the connection pool (client) is created, and subsequently two sessions are fetched from the pool. When a third session is requested, it triggers a PoolError exception as the pool is exhausted. Lines 38-42 show how to handle the exception.

Finally, the first connection is returned to the pool, and the third request can complete.

An example of the output is (the connection IDs and timestamps will differ from execution to execution):

Connection 1 ID ..........................: 239
Connection 2 ID ..........................: 240
Starting to request connection 3 .........: 18:23:14
Unable to fetch connection 3 .............: pool max size has been reached
Request for connection 3 completed .......: 18:23:44
Connection 3 ID ..........................: 241

From the output you can see that the first attempt to fetch connection 3 takes three seconds before it times out and raises the exception – just as specified by the queue_timeout setting.

What may surprise you (at least if you have studied Chapter 5 from MySQL Connector/Python Revealed) from this output is that once connection 1 has been returned to the pool and connection 3 fetches the session again, it has a new connection ID. Does that mean the pool is not working? No, the pool is working alright. However, the X Plugin (the plugin in MySQL Server handling connections using the X Protocol) works differently than the connection handling for the traditional MySQL protocol.

The X Plugin distinguishes between the connection to the application and the thread inside MySQL. So, when the session is returned to the pool and the session is reset (to set the session variables back to the defaults and remove user variables), the thread inside MySQL is removed. As MySQL uses threads, it is cheap to create a new thread when it is needed, so this is not a performance problem. However, the connection to the application is maintained. This means you save the expensive steps of creating the connection and authenticating, while the thread only exists inside MySQL while the connection is out of the pool.

If you are interested in learning more about MySQL Connector/Python 8 including how to use the X DevAPI, then I am the author of MySQL Connector/Python Revealed (Apress). It is available from Apress, Amazon, and other book stores.

Categories: Web Technologies

MySQL 8.0.13: Change Current Password Policy

Wed, 11/14/2018 - 23:30

We have introduced a new policy for you to enforce on your non-privileged users. It requires their current password at the time they set a new password. It is optional and off by default. You can control it globally (for all non-privileged users) or on a per-user basis.…
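As a rough sketch of how this policy is typically exercised in MySQL 8.0.13 (not taken from the excerpt above, and the account name is hypothetical):

SET PERSIST password_require_current = ON;            -- enforce globally for non-privileged users

ALTER USER 'appuser'@'localhost' PASSWORD REQUIRE CURRENT;   -- or enforce per account

-- The affected user must then supply the current password when setting a new one:
ALTER USER USER() IDENTIFIED BY 'New$Password1' REPLACE 'Old$Password1';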

Categories: Web Technologies

MySQL NDB Cluster row level locks and write scalability

Wed, 11/14/2018 - 23:08
MySQL NDB Cluster uses row level locks instead of a single shared commit lock in order to prevent inconsistency in simultaneous distributed transactions. This gives NDB a great advantage over all other MySQL clustering solutions and is one reason behind cluster’s unmatched ability to scale both reads and writes. 
NDB is a transactional data store. The lowest and only isolation level available in NDB is Read Committed. There are no dirty reads in NDB and only committed rows can be read by other transactions. 
All write transactions in NDB will result in exclusive row locks of all individual rows changed during the transaction. Any other transaction is allowed to read any committed row independent of their lock status. Reads are lock-free reads.
The great advantage is that committed reads in NDB never block during writes to the same data and always the latest committed changes are read. A select doesn't block concurrent writes and vice versa. 
This is extremely beneficial for write scalability. No shared global commit synchronization step is needed to ensure transaction consistency across distributed data store instances. Each instance instead handles its own row locks - usually only locking a few out of many rows. Due to NDB’s highly parallel and asynchronous design many rows can be committed in parallel within a distributed instance and across multiple instances. 
As a side effect interleaved reading transactions can read committed rows of write transactions before all rows of that writing transaction are committed. The set of rows returned may represent a partially committed transaction and not a snapshot of a single point in time. Pending transactions never change the state of the data before they are committed. All rows of committed transactions are atomically guaranteed to be network durable and consistent in all distributed instances of the data.
If more consistent reads are needed, then the read locks taken by SELECT ... LOCK IN SHARE MODE / SELECT ... FOR UPDATE can be used to get a serialized view of a set of rows.
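A minimal sketch of such a locking read (the table and column names here are purely illustrative); the rows read inside the transaction stay locked until commit:

START TRANSACTION;
-- Shared lock: other transactions can still read these rows but not modify them
SELECT balance FROM accounts WHERE id = 42 LOCK IN SHARE MODE;
-- Exclusive lock: reserve the rows for an update within this transaction
SELECT balance FROM accounts WHERE id = 42 FOR UPDATE;
UPDATE accounts SET balance = balance - 100 WHERE id = 42;
COMMIT;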

Categories: Web Technologies

MySQL Master Replication Crash Safety Part #2: lagging slaves

Wed, 11/14/2018 - 21:59
This is Part #2 of the MySQL Master Replication Crash Safety series.  In the previous post, we explored the consequence of reducing durability on masters with slaves using legacy file+position replication.  The consequences are data inconsistencies with a clear warning sign: the slaves stop replicating and report an error.  In this post, we extend our understanding of the impact of running a
Categories: Web Technologies

Migrating to Amazon Aurora: Reduce the Unknowns

Wed, 11/14/2018 - 13:28

In this Checklist for Success series, we will discuss reducing unknowns when hosting in the cloud and migrating to Amazon Aurora. These tips might also apply to other database as a service (DBaaS) offerings.

While DBaaS encapsulates a lot of the moving pieces, it also means relying on this approach for your long-term stability. This encapsulation is a two-edged sword that takes away your visibility into performance outside of the service layer.

Shine a Light on Bad Queries

Bad queries are one of the top causes of downtime. Aurora doesn’t protect you against them. Performing a query review as part of a routine health check of your workload helps ensure that you do not miss looming issues. It also helps you predict the workload at specific times and events. For example, if you already know your top three queries tend to increase exponentially, and are read bound, you can easily decide to increase the number of read-replicas on your cluster.

Having historical query performance data helps make this task easier and less stressful. While historical data allows you to look backward, it’s also very valuable to have a tool that lets you look at active incident scenarios in progress. Knowing what queries are currently running when suffering from performance issues reduces guesswork and helps solve problems faster.
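As a rough sketch of the kind of query review this enables, the statement digest summary in performance_schema (the source both of the tools discussed next draw from) can also be queried directly; the LIMIT and the picoseconds-to-seconds conversion below are just for illustration:

SELECT schema_name,
       digest_text,
       count_star              AS executions,
       sum_timer_wait / 1e12   AS total_latency_seconds
FROM performance_schema.events_statements_summary_by_digest
ORDER BY sum_timer_wait DESC
LIMIT 10;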

Pick Your Tool(s)

There are a number of ways you can achieve query performance excellence. Performance Insights is a built-in offering from AWS that is tightly integrated with RDS. It has a seven-day free retention period, with an extra cost beyond that. It is available for each instance in a cluster. Performance Insights takes most of its metrics from the Performance_Schema. It includes specific metrics from the operating system that may not be available from regular Cloudwatch metrics.

Query Analytics from Percona Monitoring and Management (PMM) also uses the same source as Performance Insights: the Performance Schema. Unlike Performance Insights though, PMM is deployed separately from the cluster. This means you can keep your metrics even if you keep recycling your cluster instances. With PMM, you can also consolidate your query reviews from a single location, and you can monitor your cluster instances from the same location – including an extensive list of performance metrics.

You can enable Performance Insights and configure for the default seven-day retention period, and then combine with PMM for longer retention period across all your cluster instances. Note though that PMM may incur a cost for additional API calls to retrieve performance insight metrics.

Outside of the built-in and open source alternatives, VividCortex, NewRelic and Datadog are excellent tools that do everything we discussed above and more. NewRelic, for example, gives you a good view of database, application, and external request timing. This, in my opinion, is very valuable.

Bad queries are not the only potential unknowns. Deleted rows, dropped tables, crippling schema changes, and even AZ/Region failures are realities in the cloud. We will discuss them next! Stay “tuned” for part two.

Meanwhile, we’d like to hear your success stories in Amazon Aurora in the comments below!

Categories: Web Technologies

Geo-distributed multi-master Galera Cluster on Amazon EC2. Join Codership at Amazon re:Invent Las Vegas 26-29 November

Wed, 11/14/2018 - 02:32

Implementing Galera Cluster on Amazon EC2 provides several advantages for companies. Multi-master clustering comes out of the box as a default in Galera. Users can read and write from any database, providing fast local writes and reads. Galera handles conflicts automatically if your application is writing to the same row. Nodes can be added and deleted on the fly if you need more capacity and elasticity. New nodes can join automatically. In the case of data center downtime, Galera provides disaster recovery, and Galera Cluster takes care of the split-brain situation too. Automated failover and recovery allow the database to continually service the application transparently.

Meet the developers and experts of Galera Cluster at Amazon re:Invent in Las Vegas, 26-29 November. You can find us at the Venetian hotel, booth number 738.

Categories: Web Technologies

Delayed Replication with Amazon RDS

Tue, 11/13/2018 - 22:13

“Delayed replication” is an important feature that has been supported in MySQL since 5.6. It induces an intentional lag in the slave, making it lag behind the master by the defined time interval.
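On a self-managed MySQL server (outside RDS) the same effect is configured directly on the slave; a minimal sketch, assuming MySQL 5.6 or later:

STOP SLAVE;
CHANGE MASTER TO MASTER_DELAY = 3600;   -- lag the slave by one hour
START SLAVE;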

For a long time this was not available with the RDS version of MySQL provided by AWS. Recently, from versions 5.6.40 and 5.7.22 onwards, this feature is available in all regions.

I will give a small intro on Amazon RDS, which is a DBaaS provided by Amazon: you are given an endpoint for all your DB operations, and most of the admin tasks for the server and DB are taken care of by Amazon. To know more, you can view our presentation here.

In this blog, I will demonstrate how to have a delayed slave with Amazon RDS for MySQL.

Note: If you are running an older version of MySQL, please upgrade to 5.6.40 or 5.7.22. It’s pretty simple and straightforward.

Below is the environment for this demo.

Instance Type: M4.large (8GB & 2 VCPU)
MySQL Version : 5.6.40 Source distribution

I am inducing an intentional lag in an existing slave, which is already in sync with the master.

Stopping the replication

mysql> CALL mysql.rds_stop_replication;
+---------------------------+
| Message                   |
+---------------------------+
| Slave is down or disabled |
+---------------------------+
1 row in set (1.02 sec)

Query OK, 0 rows affected (1.02 sec)

Add delay-interval in secs:

mysql> call mysql.rds_set_source_delay (3600);
+-----------------------------------+
| Message                           |
+-----------------------------------+
| source delay is set successfully. |
+-----------------------------------+
1 row in set (0.35 sec)

Query OK, 0 rows affected (0.35 sec)

After adding the delay, you can see a stall in the SQL thread and lag starting to build up in the slave status.
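The fields shown below are from the standard replication status output, which you can check on the replica like this:

mysql> SHOW SLAVE STATUS\G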

Exec_Master_Log_Pos: 24399479
Relay_Log_Space: 124626119
Seconds_Behind_Master: 244
SQL_Delay: 3600
SQL_Remaining_Delay: 3356
Slave_SQL_Running_State: Waiting until MASTER_DELAY seconds after master executed even

Once the slave is delayed with the required time interval of 1 hour, you can find the status as below.

Exec_Master_Log_Pos: 16928918
Relay_Log_Space: 2670652450
Seconds_Behind_Master: 3600
SQL_Delay: 3600
SQL_Remaining_Delay: 0
Slave_SQL_Running_State: Waiting until MASTER_DELAY seconds after master executed even.

You can also check the status as below.

mysql> call mysql.rds_show_configuration;
+------------------------+-------+------------------------------------------------------------------------------------------------------------+
| name                   | value | description                                                                                                |
+------------------------+-------+------------------------------------------------------------------------------------------------------------+
| binlog retention hours | 4     | binlog retention hours specifies the duration in hours before binary logs are automatically deleted.      |
| source delay           | 3600  | source delay specifies replication delay in seconds between current instance and its master.              |
| target delay           | 0     | target delay specifies replication delay in seconds between current instance and its future read-replica. |
+------------------------+-------+------------------------------------------------------------------------------------------------------------+

Benefits:

Below are some of the benefits of having a delayed slave.

  • Recovery of data after a human error, such as an accidental delete or drop of a table
  • As a backup server
  • For ETL/analytics with an acceptable delay

Thanks for your time and attention. Next, we are waiting for filtered replication to be implemented in RDS, which would solve many more use cases.

Categories: Web Technologies

Pager for the New MySQL Shell

Tue, 11/13/2018 - 13:19
I love the new shell, but the one thing I missed from the old shell was the ability to use a pager like more or less to throttle the output to the screen. Lo and behold, the engineers have added paging in MySQL Shell 8.0.13!!

Turn Paging On
Things can scroll off the screen quickly, and pagination programs keep the output in small chunks. To pick the more program as your pagination program, simply enter \pager more, or for less enter \pager less.

See the first illustration



How to turn paging on and off with the MySQL Shell 8.0.13


Yeah!

A big thanks to the MySQL Engineers!!

Turn Paging Off
Sometimes you do not need pagination; simply typing \pager and return will clear the setting. See the first illustration.


You can also determine how many lines are output before the output pauses: \pager more -6 stops after every six lines of output.

Setting the output of more to six lines

Other Pagination Programs
You can declare your desire to use another program for pagination on the command line like so: mysqlsh --pager=foo

Tip: if you set the pager to \pager less -S, it will page horizontally too, so you can view wide results even without \G
The Documentation

The manual page for the pager is “Using a Pager”.





Categories: Web Technologies

ProxySQL 1.4.12 and Updated proxysql-admin Tool

Tue, 11/13/2018 - 10:54

ProxySQL 1.4.12, released by ProxySQL, is now available for download in the Percona Repository along with an updated version of Percona’s proxysql-admin tool.

ProxySQL is a high-performance proxy, currently for MySQL and its forks (like Percona Server for MySQL and MariaDB). It acts as an intermediary for client requests seeking resources from the database. René Cannaò created ProxySQL for DBAs as a means of solving complex replication topology issues.

The ProxySQL 1.4.12 source and binary packages available at https://percona.com/downloads/proxysql include ProxySQL Admin – a tool, developed by Percona to configure Percona XtraDB Cluster nodes into ProxySQL. Docker images for release 1.4.12 are available as well: https://hub.docker.com/r/percona/proxysql/. You can download the original ProxySQL from https://github.com/sysown/proxysql/releases. GitHub hosts the documentation in the wiki format.

Improvements
  • #68: Scripts are now compatible with Percona XtraDB Cluster (PXC) hosts using IPv6
  • #107: In include-slaves, a slave would always be moved into the write hostgroup even if the whole cluster went down. A new option --use-slave-as-writer specifies whether or not the slave is added to the write hostgroup.
Bugs Fixed
  • #110: In some cases, pattern cluster hostname did not work with proxysql-admin.
  • #104: proxysql-admin testsuite bug fixes.
  • #113: proxysql_galera_checker assumed that parameters were given in the long format
  • #114: In some cases, ProxySQL could not be started
  • #115: proxysql_node_monitor could fail with more than one command in the scheduler
  • #116: In some cases, the scheduler was reloading servers on every run
  • #117: The --syncusers option did not work when enabling cluster
  • #125: The function check_is_galera_checker_running was not preventing multiple instances of the script from running

Other bugs fixed: #112, #120

ProxySQL is available under the open source GPLv3 license.

Categories: Web Technologies

Druid @ Airbnb Data Platform

Tue, 11/13/2018 - 09:46
How Druid enables analytics at Airbnb
Realtime and batch analytics at Airbnb and the role Druid plays in our analytics system architecture

By Pala Muthiah and Jinyang Li

Wikipedia: A druid was a member of the high-ranking professional class in ancient Celtic cultures. It is believed that druids perform secret rituals in forests, not unlike this unique Experience on Airbnb!

Airbnb serves millions of guests and hosts in our community. Every second, their activities on Airbnb.com, such as searching, booking, and messaging, generate a huge amount of data we anonymize and use to improve the community’s experience on our platform.

The Data Platform Team at Airbnb strives to leverage this data to improve our customers’ experiences and optimize Airbnb’s business. Our mission is to provide infrastructure to collect, organize, and process this deluge of data (all in privacy-safe ways), and empower various organizations across Airbnb to derive necessary analytics and make data-informed decisions from it.

The primary way high-level analytics is exposed and shared within the company is through various dashboards. A lot of people use these dashboards every day to make various decisions. Dashboards also allow real-time tracking and monitoring of various aspects of our business and systems. As a result, the timeliness of these dashboards is critical to the daily operation of Airbnb. However, we are faced with three challenges:

First, it would take a long time to aggregate data in the warehouse and generate the necessary data for these dashboards using systems like Hive and Presto at query time. Hive and Presto have to read all the data and aggregate it on demand, resulting in all necessary computation getting invoked at query time. Even if those engines are used to pre-compute the aggregations and store them, the storage format is not optimized for the repeated slicing and dicing of data that analytics queries demand.

Second, the system needs to be reliable and scalable. It is powering core analytics use cases at Airbnb, hence any downtime will have severe impact on the business and its employees. Also, the volume of data, queries, and users continue to grow and our analytics system should be able to cope with increasing demand.

Third, we need a system that integrates well with our data infrastructure that is based on open source frameworks. For example, most of our datasets are stored in Hadoop, and we use Kafka and Spark Streaming to process our data streams.

This is where Druid comes in.

Advantages of Druid
Fast query time

With predefined data-sources and pre-computed aggregations, Druid offers sub-second query latency. The dashboards built on top of Druid can be noticeably faster than those built on other systems. Compared to Hive and Presto, Druid can be an order of magnitude faster.

Architecture that Offers Reliability and Scalability

Druid architecture is well separated out into different components for ingestion, serving, and overall coordination. We have found this componentized architecture to be reliable and stable for our workload, and it has allowed us to scale the system easily as needed.

Druid’s architecture of separating data storage into deep storage for long-term storage of data, while caching the data temporarily in historical nodes, has worked well for us. Keeping the analytical data permanently in S3 gives us disaster recovery for free and allows us to easily manage upgrade and upkeep of cluster hardware (e.g. easily switch node types to take advantage of latest hardware).

Integration with Open Source Frameworks

Druid also integrates smoothly with the open source data infrastructure that’s primarily based on Hadoop and Kafka:

  1. Druid’s API allows us to easily ingest data from Hadoop for batch analytics
  2. Druid enables real-time analytics via stream processing engines. Druid provides a streaming client API, Tranquility, that is integrated with streaming engines such as Samza or Storm and can be integrated with any other JVM-based streaming engine. At Airbnb, streaming ingestion of data into Druid for real-time analytics is implemented through a Spark Streaming job employing the Tranquility client.
  3. Druid is well integrated with Apache Superset, an open source data visualization system developed and open sourced by Airbnb. Superset serves as the interface for users to compose and execute analytics queries on Druid and visualize the results.
How Airbnb Uses Druid: Dual Cluster Configuration

At Airbnb, two Druid clusters are running in production. Two separate clusters allow dedicated support for different uses, even though a single Druid cluster can handle more data sources than what we need. In total we have 4 Brokers, 2 Overlords, 2 Coordinators, 8 Middle Managers, and 40 Historical nodes. In addition, our clusters are supported by one MySQL server and one ZooKeeper cluster with 5 nodes. Druid clusters are relatively small and low cost compared with other service clusters like HDFS and Presto.

Of the two Druid clusters, one is dedicated to centralized critical metrics services. With the goal of serving all the dashboards at Airbnb, users can easily define their metrics through simple YAML files. Users can view their dashboards and metrics on Superset without knowing anything about Druid.

All the batch jobs are scheduled with Airflow, ingesting data from our Hadoop cluster.

All the real-time and other data sources for self-service users are handled by the other Druid cluster. Real-time data are ingested through Spark Streaming + Tranquility client setup.

Improving Druid Usage at Airbnb

While Druid provides many powerful widely applicable features that satisfy most businesses, we did implement features within or on top of Druid to better serve our special use cases.

A Framework for Instantly Answering Ad-hoc Analytics Queries

Airbnb has a large number of data scientists embedded in different business teams. Each of them may have ad-hoc questions about the business that need insight derived from the data, which often requires arbitrary ways to aggregate data.

To meet this need, we have built a self-service system on top of Druid that allows individual teams to easily define how the data their application or service produces should be aggregated and exposed as a Druid data source. Data scientists and analysts can then query Druid to answer ad-hoc questions.

Users define their data source with a simple configuration file. Real-time data from Kafka and batch data from HDFS/S3 will be ingested according to the config file.

Druid aggregates its real-time data over windows of 5 minutes, plus 1-minute latency from pipelines.

The real-time streaming in Druid empowers us to enable a number of sophisticated functionalities for our users. One of the interesting use cases for real-time ingestion is anomaly detection. With real-time data ingested and aggregated quickly in Druid, we can very quickly detect anything in production that does not conform to an expected pattern.

Integration with Presto

Druid has a mature query mechanism with a JSON-over-HTTP RESTful API, in addition to SQL query support in recent versions. However, one of the limitations of Druid is that it does not yet allow cross-data-source queries (simplistically speaking, a join query). All aggregate queries are limited to a single data source. At Airbnb, however, we do have scenarios where multiple data sources with overlapping dimensions need to be joined together for certain queries. The alternative is to keep all the data in one single data source, which is not optimal in our scenario for various reasons, including the cadence of data generation, the source of the data being different (e.g. different services produce the data), and so on. However, the need for cross-data-source queries is real and has recently become a hard requirement.
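To make the requirement concrete, this is roughly the kind of query such a setup needs to support; the catalog, schema, table, and column names below are purely illustrative, not Airbnb's actual configuration:

SELECT s.listing_market,
       SUM(s.search_count)  AS searches,
       SUM(b.booking_count) AS bookings
FROM druid.default.searches AS s
JOIN druid.default.bookings AS b
  ON s.listing_market = b.listing_market
GROUP BY s.listing_market;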

To cater to these scenarios, we have developed an in-house solution that is based on Presto. Specifically, we introduced a Presto connector for Druid that can push down queries to Druid for individual data sources, and can retrieve and join the data to complete the execution of the cross-data-source query. The details of the implementation are still evolving and are out of scope for this article. We will provide more details in a separate post in the future.

Improve backfill performance

The secret to why Druid queries are much faster than other systems comes at the cost of ingestion. Every data segment needs to be ingested via MapReduce jobs before it is available for queries. This works great as a write-once-read-many model, and the framework only needs to ingest new data on a daily basis.

However, problems arise when an owner of a data source wants to redesign it and regenerate historical data. It means data over the past years needs to be re-ingested into Druid to replace the old data. This requires a very large ingestion job with a long-running MapReduce task, making it expensive, especially when an error happens in the middle of re-ingestion.

One potential solution is to split the large ingestion into several requests in order to achieve better reliability. However, query results will be inconsistent as they will be computed from a mix of existing old data and newly ingested data. Backfill jobs are actually more frequent than we expected as user requirements and ingestion framework functionalities evolve, making their performance a pain point that begs improvement.

To solve this, we have designed a solution that basically keeps all the newly ingested segments inactive until explicit activation. This enables the ingestion framework to split the source data into smaller intervals of acceptable sizes. The framework then ingests these intervals in parallel (as parallel as the Yarn cluster resources allow). Since the newly ingested data is still inactive, the segments are hidden in the background and there’s no mix of different versions of data when computing results for queries being executed while backfill ingestion is still in progress. When we activate the latest version of segments for the data source, it will be refreshed with the new version without downtime. Split and refresh greatly improved backfill performance: backfills that used to run longer than a day now finish in one hour.

Monitoring and operation

We monitor Druid continuously for reliable service and best performance. Druid is robust and resilient to node failures. Most node failures are transparent and unnoticeable to users. Even if a role that is a single point of failure (like Coordinator, Overlord, or even ZooKeeper) fails, the Druid cluster is still able to provide query service to users. However, to honor our SLA with users, any service outage should be caught in time or even before failure happens.

Like other clusters, we monitor every machine in the Druid clusters by collecting machine statistics and raising an alert if any instance reaches its capacity or goes into bad state. To monitor overall cluster availability, we ingest one piece of canary data into Druid every 30 minutes, and check if query result from each Broker node matches the latest ingested data every 5 minutes. Any degradation in service, including query, ingestion, or downstream HDFS instability, can be detected within the SLA.

Druid has been running at Airbnb for years and it is one of the systems with the lowest maintenance cost. Druid’s multi-role design makes operations easy and reliable. The cluster administrator can adjust the cluster configuration and add/remove nodes based on the monitoring metrics. As data grows in our Druid cluster, we can continue adding historical node capacity to cache and serve the larger amount of data easily. If the real-time ingestion workload shows an uptick, we can easily add middle manager nodes accordingly. Similarly, if more capacity is needed to handle queries, we can increase the broker node count. Thanks to Druid’s decoupled architecture, we have done a large operation that migrated all data in deep storage from HDFS to S3 with a newly rebuilt cluster, with only minutes of downtime.

Challenges and future improvements

While Druid has served us well in our data platform architecture, there are new challenges as our usage of Druid grows within the company.

One of the issues we deal with is the growth in the number of segment files that are produced every day and need to be loaded into the cluster. Segment files are the basic storage unit of Druid data; they contain the pre-aggregated data ready for serving. At Airbnb, we are encountering a few scenarios where a large number of our data sources sometimes need to be recomputed entirely, resulting in a large number of segment files that need to be loaded onto the cluster at once. Currently, ingested segments are loaded by coordinators sequentially in a single thread, centrally. As more and more segments are produced, the coordinator is unable to keep up and we see increasing delay between the time an ingestion job completes and the time the data becomes available for querying (after being loaded by the coordinator). Sometimes the delay can be hours long.

The usual solution is to attempt to increase the target segment size and thus reduce the segment count. However, in our usage, the input volume of data to produce a larger segment (by a Hadoop worker running the ingestion task) is so high that the Hadoop job would run for too long crunching that data, and many times would fail for various reasons.

We are currently exploring various solutions, including compacting segments right after ingestion and before it is handed off to the coordinator, and different configurations to increase the segment size without jeopardizing the ingestion job stability when possible.

Conclusion

Druid is a big data analytics engine designed for scalability, maintainability, and performance. Its well factored architecture allows easy management and scaling of Druid deployment, and its optimized storage format enables low latency analytics queries. We have successfully deployed Druid at Airbnb for our use cases and see continued growth in its footprint as our user base and use cases grow.

Airbnb’s Data Platform team is always looking for good engineers with relevant skills! If you are enthusiastic about building out data infrastructure like this one and interested in joining the team, please check out our open positions and send your application!

Druid @ Airbnb Data Platform was originally published in Airbnb Engineering & Data Science on Medium, where people are continuing the conversation by highlighting and responding to this story.

Categories: Web Technologies

Build a React To-Do App with React Hooks (No Class Components)

Tue, 11/13/2018 - 07:52

Yes, a to-do app seems like something we’ve all done in the past. The great thing about a to-do app is that it covers all 4 aspects of CRUD: Create, Read, Update, Delete. As developers, we need to know CRUD, and a to-do app is the best way to start that learning or refresh some of the things we may be a little rusty on.

React is the frontend JavaScript library we are going to be using today. Whether you are new to React or are an avid React user, we can all agree on one thing: React is pretty great.

We've already done a React To-Do App here at Scotch, but that was with Class components. Today we will have no Class components by integrating React Hooks, React’s newest upgrade feature. Let’s get started.

https://codesandbox.io/s/oj3qm2zq06

React Hooks

React is always improving and growing. The latest update is React 16.7, so new it’s in alpha stages. In this upgrade, we are given the power of React Hooks.

React Hooks allow for functional components to have a state and utilize lifecycle methods.

No longer do we need to rely on just class components for that functionality.

You can learn all about React Hooks here or visit React’s docs on Hooks here.

Starting a React App

Navigate to the place you would like your new application to be located and type:

npx create-react-app react-to-do

Note: Running npx before the command allows for it to be installed if it is not already installed globally on your machine.

Sit back and relax, React is taking care of the initial build for your new application.

Once it is done, you will need to navigate into the project:

cd react-to-do

and then to run the project:

npm run start

and navigate to http://localhost:3000/ to see the spinning React logo in your browser.

Styling Your Application

Jump into your src/App.css file and add in the three classes we will be using throughout our app. Styling won't be the focus of this app, so we'll keep this short and sweet.

src/App.css

.app {
  background: #209cee;
  padding: 30px;
  height: 100vh;
}

.todo-list {
  background: #e8e8e8;
  border-radius: 4px;
  padding: 5px;
  max-width: 400px;
}

.todo {
  background: #fff;
  box-shadow: 1px 1px 1px rgba(0, 0, 0, 0.15);
  padding: 3px 10px;
  font-size: 12px;
  margin-bottom: 6px;
  border-radius: 3px;
  display: flex;
  align-items: center;
  justify-content: space-between;
}

Reading a List of Items. cRud.

With your application running and the styling ready to be used, let’s start on the Read part of CRUD. We’ll want to make a list of things, just so we can Read/view the list.

Adding in State

Go into your src/App.js file and let’s add a state to our component. We are going to be using React Hooks so state will look a little different than what you're used to in classes.

src/App.js

function App() {
  const [todos, setTodos] = useState([
    { text: "Learn about React" },
    { text: "Meet friend for lunch" },
    { text: "Build really cool todo app" }
  ]);

  // we'll render our todos here ...
  // return <div></div>
}

The component, as we can see, is a functional component. In past versions of React, function components were unable to handle state, but now, by using Hooks, they can.

  • The first element, todos, is what we are going to name our state.
  • The second element, setTodos, is what we are going to use to set the state.

We've got a writeup on array destructuring if you want to know more info about that [todos, setTodos] syntax here.

The hook of useState is what React uses to "hook" into the state or lifecycle of the component. We then create an array of objects and we have the beginnings of our state.

Comparing to a Class Component

Let's take a quick detour and see how this would've been done with classes:

class App extends Component {
  state = {
    todos: [
      { text: "Learn about React" },
      { text: "Meet friend for lunch" },
      { text: "Build really cool todo app" }
    ]
  };

  setTodos = todos => this.setState({ todos });

  render() {
    return <div></div>;
  }
}

A lot more typing. React Hooks lets us make that really clean! We'll continue with our functional component version from above.

Our "Todo" Component

We will want to create a component that we can use later on in the return of the main App component. We will call it Todo; we will pass in the todo and show the “text” part of the todo (todo.text), like so:

const Todo = ({ todo }) => <div className="todo">{todo.text}</div>;

Let’s see how we will use that in our App component.

Using Our Todo Variable to Get a List Returned

Go down to the return part of the App component and remove almost everything. We want to empty out that return part especially so that the spinning logo we saw earlier when we navigated to http://localhost:3000, goes away and we have our list being displayed on the page.

By using the JavaScript method, map(), we are able to create a new array of items by mapping over the todo items from state and displaying them by index.

Let’s create a pretty list of items:

src/App.js

return (
  <div className="app">
    <div className="todo-list">
      {todos.map((todo, index) => (
        <Todo key={index} index={index} todo={todo} />
      ))}
    </div>
  </div>
);

Navigate to your browser and you should see something like this:

Creating New Items to the To-Do List - Crud

Want to create a new item to the list? What if we forgot to add something to our to-do list and don’t want to forget that thing? Let’s give our application the power to Create a new item for our to-do app.

While in the src/App.js file, we are going to want to add a couple of things. At the top we are going to add another component, we’ll call it TodoForm. In this component we want it to:

  • Start with an empty state for an input field.
  • Be able to update the form by setting the state.
  • Handle the submit.
Setting our Empty State for the Form Input

Remember, we are using React Hooks so state is going to be a little different. To set our state we are going to want to write it like so:

const [value, setValue] = useState("");

The first is the "value" and the second is how we are going to be setting the state. The state starts off empty and as we add things to our state, it will add it to our list of to-do items.

We will want to add in a handleSubmit variable that can handle our addTodo function (we will make that function soon) and add the item to the list. If nothing is in the input box and the user presses “enter”, we want it to not do anything (i.e., not add in an empty tile to the list).

Adding that functionality into a form that has an input box, we should have our code look like this:

src/App.js

function TodoForm({ addTodo }) {
  const [value, setValue] = useState("");

  const handleSubmit = e => {
    e.preventDefault();
    if (!value) return;
    addTodo(value);
    setValue("");
  };

  return (
    <form onSubmit={handleSubmit}>
      <input
        type="text"
        className="input"
        value={value}
        onChange={e => setValue(e.target.value)}
      />
    </form>
  );
}

The addTodo function I told you about earlier? Let’s go ahead and build that now. Staying within App.js, under the state of the App component, the function should be able to grab the existing list of items, add on the new item, and display that new list.

src/App.js

const addTodo = text => {
  const newTodos = [...todos, { text }];
  setTodos(newTodos);
};

Notice the lack of this.state? With the new React Hooks, we no longer need to use it. Can you use it? Sure, of course, in a class component. But the new Hooks allow for less typing and more efficiency, and the state handling that this.state used to provide is implied in certain places.

See that spread operator? The three dots before todos essentially "copy" the list for us so that we are able to add on the new to-do item. Then, using the setter we created earlier, we set the state with setTodos.

By using the TodoForm down in the return of the App component, we will see that input box pop up now. The entire src/App.js file should look like this so far:

src/App.js

import React, { useState } from "react";
import "./App.css";

const Todo = ({ todo }) => <div className="todo">{todo.text}</div>;

function TodoForm({ addTodo }) {
  const [value, setValue] = useState("");

  const handleSubmit = e => {
    e.preventDefault();
    if (!value) return;
    addTodo(value);
    setValue("");
  };

  return (
    <form onSubmit={handleSubmit}>
      <input
        type="text"
        className="input"
        value={value}
        onChange={e => setValue(e.target.value)}
      />
    </form>
  );
}

function App() {
  const [todos, setTodos] = useState([
    { text: "Learn about React" },
    { text: "Meet friend for lunch" },
    { text: "Build really cool todo app" }
  ]);

  const addTodo = text => {
    const newTodos = [...todos, { text }];
    setTodos(newTodos);
  };

  return (
    <div className="app">
      <div className="todo-list">
        {todos.map((todo, index) => (
          <Todo key={index} index={index} todo={todo} />
        ))}
        <TodoForm addTodo={addTodo} />
      </div>
    </div>
  );
}

export default App;

Go to your browser and play around. You can now add in a to-do item to your list!

Updating Items in the To-Do List to be Completed - crUd

How would we want to update our to-do application? Maybe let’s have the functionality of being able to cross off an item. Let’s get started on that code!

Updating our State

Our state in our App component needs a little extra to it for the "Completed" status to be able to change. We will be adding in another key/value pair to our list of objects. By adding in an "isCompleted: false" value, we set that to false to begin with and will, when prompted, change that to true.

src/App.js

const [todos, setTodos] = useState([
  { text: "Learn about React", isCompleted: false },
  { text: "Meet friend for lunch", isCompleted: false },
  { text: "Build really cool todo app", isCompleted: false }
]);

We will need a function like the addTodo function but this one will be able to “complete” an item. We will want to do some similar things that we did in the addTodo like using the spread operator to grab the current list of items. In this function, we will be changing the isCompleted status to true so that it knows it has been completed. It will then update the state and set the state to the newTodos.

src/App.js

const completeTodo = index => {
  const newTodos = [...todos];
  newTodos[index].isCompleted = true;
  setTodos(newTodos);
};

By using completeTodo in the Todo function, we are going to be able to fire off that functionality. When the “Complete” button is clicked, it will add in the textDecoration styling and cross out the item. We are using a ternary operator, a concise JavaScript alternative to an if/else statement. This is our way of completing an item on the list and “updating” the list. The code should look as follows:

src/App.js

function Todo({ todo, index, completeTodo }) {
  return (
    <div
      className="todo"
      style={{ textDecoration: todo.isCompleted ? "line-through" : "" }}
    >
      {todo.text}
      <div>
        <button onClick={() => completeTodo(index)}>Complete</button>
      </div>
    </div>
  );
}

Dive down to the return of the App component and we’ll add in the following line:

completeTodo={completeTodo}

to look like this in the code:

src/App.js

<div className="card-content">
  {todos.map((todo, index) => (
    <Todo
      key={index}
      index={index}
      todo={todo}
      completeTodo={completeTodo}
    />
  ))}
</div>

Returning to the browser, your to-do app should look something like this when a “Complete” button is clicked.

Now we can read our list, add to our list, and update the completed status of each item. What’s left? Delete an item.

Deleting a To-Do Item - cruD

So you’ve completed an item on your to-do list, the day is over and you want to delete it from your list to be able to start over tomorrow. We are going to want to delete that item completely. Let’s see how we can get that going.

By adding just a couple lines, we will be able to add in the functionality of deleting an item.

We will go ahead and build the removeTodo function so that when we click on an “X” to delete an item, that will be fired off. That function will be located by the others underneath the state of the App component.

In this removeTodo function, we will again be using the spread operator but once we grab that current list, we will be "splicing" the chosen index off of the array of items. Once that is removed, we will return the new state by setting it with setTodos to be newTodos.

src/App.js

const removeTodo = index => {
  const newTodos = [...todos];
  newTodos.splice(index, 1);
  setTodos(newTodos);
};

In your Todo function, you will want to add in this line:

<button onClick={() => removeTodo(index)}>x</button>

like this:

src/App.js

function Todo({ todo, index, completeTodo, removeTodo }) {
  return (
    <div
      className="todo"
      style={{ textDecoration: todo.isCompleted ? "line-through" : "" }}
    >
      {todo.text}
      <div>
        <button onClick={() => completeTodo(index)}>Complete</button>
        <button onClick={() => removeTodo(index)}>x</button>
      </div>
    </div>
  );
}

You’ll see as well that we are bringing in removeTodo at the top and then using it in the onClick of the “X”.

Adding in removeTodo in the Todo part of the App component's return, our “delete” will be fully functional. Add it in here:

src/App.js

<Todo
  key={index}
  index={index}
  todo={todo}
  completeTodo={completeTodo}
  removeTodo={removeTodo}
/>

With that added in, go to your browser and you’ll see a button with an "X" that when clicked, deletes the item completely.

The Final Product

The entire src/App.js file should look like this in the end:

src/App.js

import React, { useState } from "react";
import "./App.css";

function Todo({ todo, index, completeTodo, removeTodo }) {
  return (
    <div
      className="todo"
      style={{ textDecoration: todo.isCompleted ? "line-through" : "" }}
    >
      {todo.text}
      <div>
        <button onClick={() => completeTodo(index)}>Complete</button>
        <button onClick={() => removeTodo(index)}>x</button>
      </div>
    </div>
  );
}

function TodoForm({ addTodo }) {
  const [value, setValue] = useState("");

  const handleSubmit = e => {
    e.preventDefault();
    if (!value) return;
    addTodo(value);
    setValue("");
  };

  return (
    <form onSubmit={handleSubmit}>
      <input
        type="text"
        className="input"
        value={value}
        onChange={e => setValue(e.target.value)}
      />
    </form>
  );
}

function App() {
  const [todos, setTodos] = useState([
    { text: "Learn about React", isCompleted: false },
    { text: "Meet friend for lunch", isCompleted: false },
    { text: "Build really cool todo app", isCompleted: false }
  ]);

  const addTodo = text => {
    const newTodos = [...todos, { text }];
    setTodos(newTodos);
  };

  const completeTodo = index => {
    const newTodos = [...todos];
    newTodos[index].isCompleted = true;
    setTodos(newTodos);
  };

  const removeTodo = index => {
    const newTodos = [...todos];
    newTodos.splice(index, 1);
    setTodos(newTodos);
  };

  return (
    <div className="app">
      <div className="todo-list">
        {todos.map((todo, index) => (
          <Todo
            key={index}
            index={index}
            todo={todo}
            completeTodo={completeTodo}
            removeTodo={removeTodo}
          />
        ))}
        <TodoForm addTodo={addTodo} />
      </div>
    </div>
  );
}

export default App;

What have we learned?

A to-do app can be a great reminder or starting point when it comes to CRUD in web development. Being able to read information, create new information, update existing information, and delete said information can be powerful in any application.

React Hooks are great. They allow for a more straightforward way of coding and can make your code clear and concise.

Now go have fun adding all your many to-do items to your newly built to-do app. And then have even more fun crossing them off when you finish them!

Happy coding!

Categories: Web Technologies

Build a To-Do App with React Hooks (No Class Components)

Tue, 11/13/2018 - 07:52

Yes, a to-do app, seems like something we’ve all done in the past. The great thing about a to-do app is that it covers all 4 aspects of CRUD; Create, Read, Update, Delete. As a developer, we need to know CRUD and a to-do app is the best way to start that learning or refresh some of the things we may be a little rusty on.

React is the frontend JavaScript library we are going to be using today. Whether you are new to React or are an avid React user, we can all agree on one thing: React is pretty great.

We've already done a React To-Do App here at Scotch, but that was with Class components. Today we will have no Class components by integrating React Hooks, React’s newest upgrade feature. Let’s get started.

https://codesandbox.io/embed/oj3qm2zq06

React Hooks

React is always improving and growing. The latest update is React 16.7, so new it’s in alpha stages. In this upgrade, we are given the power of React Hooks.

React Hooks allow for functional components to have a state and utilize lifecycle methods.

No longer do we need to rely on just class components for that functionality.

You can learn all about React Hooks here or visit React’s docs on Hooks here.

Starting a React App

Navigate to the place you would like your new application to be located and type:

npx create-react-app react-to-do

Note: Running npxbefore the command allows for it to be installed if it is not already installed globally on your machine.

Sit back and relax, React is taking care of the initial build for your new application.

Once it is done, you will need to navigate into the project:

cd react-to-do

and then to run the project:

npm run start

and navigate to http://localhost:3000/ to see the spinning React logo in your browser.

Styling Your Application

Jump into your src/App.css file and add in the three classes we will be using throughout our app. Styling won't be the focus of this app, so we'll keep this short and sweet.

src/App.css .app { background: #209cee; padding: 30px; height: 100vh; } .todo-list { background: #e8e8e8; border-radius: 4px; padding: 5px; max-width: 400px; } .todo { background: #fff; box-shadow: 1px 1px 1px rgba(0, 0, 0, 0.15); padding: 3px 10px; font-size: 12px; margin-bottom: 6px; border-radius: 3px; display: flex; align-items: center; justify-content: space-between; } Reading a List of Items. cRud.

With your application running and the styling ready to be used, let’s start on the Read part of CRUD. We’ll want to make a list of things, just so we can Read/view the list.

Adding in State

Go into your src/App.js file and let’s add a state to our component. We are going to be using React Hooks so state will look a little different than what you're used to in classes.

src/App.js function App() { const [todos, setTodos] = useState([ { text: "Learn about React" }, { text: "Meet friend for lunch" }, { text: "Build really cool todo app" } ]); // we'll render our todos here ... // return <div></div> }

The component, as we can see, is a functional component. In past versions of React, function components were unable to handle state, but now, by using Hooks, they can.

  • The first parameter, todos is what we are going to name our state.
  • The second parameter, setTodos is what we are going to use to set the state.

We've got a writeup on array destructuring if you want to know more info about that [todos, setTodos] syntax here.

The hook of useState is what React uses to "hook" into the state or lifecycle of the component. We then create an array of objects and we have the beginnings of our state.

Comparing to a Class Component

Let's take a quick detour and see how this would've done with classes:

class App extends Component { state = { todos: [ { text: "Learn about React" }, { text: "Meet friend for lunch" }, { text: "Build really cool todo app" } ] } setTodos = todos => this.setState({ todos }); render() { return <div></div> } }

A lot more typing. React Hooks let us make that really clean! We'll continue with our functional component version from above.

Our "Todo" Component

We will want to create a component that we can use later on in the return of the main App component. We will call it Todo; it takes in a todo prop and shows the "text" part of the todo (todo.text), like so:

const Todo = ({ todo }) => <div className="todo">{todo.text}</div>;

Let’s see how we will use that in our App component.

Using Our Todo Variable to Get a List Returned

Go down to the return part of the App component and remove almost everything. We want to empty out that return so that the spinning logo we saw earlier at http://localhost:3000 goes away and our list is displayed on the page instead.

By using the JavaScript map() method, we can create a new array of items by mapping over the todo items from state and displaying them by index.
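
If map() is new to you, here is a tiny standalone sketch (the fruits array is invented for the example) showing that it returns a new array and that the callback's second argument is the index:

const fruits = ["apple", "banana", "cherry"];

// map() builds a new array; each callback call gets the item and its index.
const labels = fruits.map((fruit, index) => index + ": " + fruit);

console.log(labels); // ["0: apple", "1: banana", "2: cherry"]
console.log(fruits); // ["apple", "banana", "cherry"] - the original is unchanged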

Let’s create a pretty list of items:

src/App.js

return (
  <div className="app">
    <div className="todo-list">
      {todos.map((todo, index) => (
        <Todo key={index} index={index} todo={todo} />
      ))}
    </div>
  </div>
);

Navigate to your browser and you should see something like this:

Creating New Items to the To-Do List - Crud

Want to create a new item to the list? What if we forgot to add something to our to-do list and don’t want to forget that thing? Let’s give our application the power to Create a new item for our to-do app.

While in the src/App.js file, we are going to want to add a couple of things. At the top we are going to add another component; we'll call it TodoForm. We want this component to:

  • Start with an empty state for an input field.
  • Be able to update the form by setting the state.
  • Handle the submit.

Setting our Empty State for the Form Input

Remember, we are using React Hooks so state is going to be a little different. To set our state we are going to want to write it like so:

const [value, setValue] = useState("");

The first element is the value of the input and the second is the function we use to set it. The value starts off empty, and as we submit new entries, they get added to our list of to-do items.

We will want to add a handleSubmit function that can call our addTodo function (we will make that function soon) and add the item to the list. If nothing is in the input box and the user presses "enter", we don't want it to do anything (i.e., not add an empty tile to the list).

Adding that functionality into a form with an input box, our code should look like this:

src/App.js

function TodoForm({ addTodo }) {
  const [value, setValue] = useState("");

  const handleSubmit = e => {
    e.preventDefault();
    if (!value) return;
    addTodo(value);
    setValue("");
  };

  return (
    <form onSubmit={handleSubmit}>
      <input
        type="text"
        className="input"
        value={value}
        onChange={e => setValue(e.target.value)}
      />
    </form>
  );
}

The addTodo function I told you about earlier? Let’s go ahead and build that now. Staying within App.js, under the state of the App component, the function should be able to grab the existing list of items, add on the new item, and display that new list.

src/App.js

const addTodo = text => {
  const newTodos = [...todos, { text }];
  setTodos(newTodos);
};

Notice the lack of this.state.? With React Hooks, we no longer need it. Can you still write this with a class and this.state? Sure, of course. But Hooks allow for less typing and more efficiency, because the state variable is referenced directly.

See that spread operator? The three dots before todos essentially "copy" the list for us so that we can add the new to-do item onto the end. Then, using the setter we created earlier, we set the state with setTodos.
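
As a rough sketch of why that copy matters (the array below is made up and not part of the app): React compares state references, so building a new array is the safe way to signal a change, while pushing onto the existing array keeps the same reference.

const todos = [{ text: "Learn about React" }];

// Spreading into a new array creates a brand new reference:
const newTodos = [...todos, { text: "New item" }];

console.log(todos.length);       // 1 - the original is untouched
console.log(newTodos.length);    // 2
console.log(todos === newTodos); // false - a different array, so React sees a change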

By using the TodoForm down in the return of the App component, we will see that input box pop up now. The entire src/App.js file should look like this so far:

src/App.js

import React, { useState } from "react";
import "./App.css";

const Todo = ({ todo }) => <div className="todo">{todo.text}</div>;

function TodoForm({ addTodo }) {
  const [value, setValue] = useState("");

  const handleSubmit = e => {
    e.preventDefault();
    if (!value) return;
    addTodo(value);
    setValue("");
  };

  return (
    <form onSubmit={handleSubmit}>
      <input
        type="text"
        className="input"
        value={value}
        onChange={e => setValue(e.target.value)}
      />
    </form>
  );
}

function App() {
  const [todos, setTodos] = useState([
    { text: "Learn about React" },
    { text: "Meet friend for lunch" },
    { text: "Build really cool todo app" }
  ]);

  const addTodo = text => {
    const newTodos = [...todos, { text }];
    setTodos(newTodos);
  };

  return (
    <div className="app">
      <div className="todo-list">
        {todos.map((todo, index) => (
          <Todo key={index} index={index} todo={todo} />
        ))}
        <TodoForm addTodo={addTodo} />
      </div>
    </div>
  );
}

export default App;

Go to your browser and play around. You can now add in a to-do item to your list!

Updating Items in the To-Do List to be Completed - crUd

How would we want to update our to-do application? Maybe let’s have the functionality of being able to cross off an item. Let’s get started on that code!

Updating our State

The state in our App component needs a little extra for the "Completed" status to be able to change. We will add another key/value pair to each object in our list. The "isCompleted: false" value starts each item off as not completed, and we will change it to true when prompted.

src/App.js

const [todos, setTodos] = useState([
  { text: "Learn about React", isCompleted: false },
  { text: "Meet friend for lunch", isCompleted: false },
  { text: "Build really cool todo app", isCompleted: false }
]);

We will need a function like addTodo, but this one will "complete" an item instead. It does some of the same things we did in addTodo, like using the spread operator to grab the current list of items. In this function, we change the item's isCompleted status to true so that we know it has been completed, and then set the state to newTodos.

src/App.js

const completeTodo = index => {
  const newTodos = [...todos];
  newTodos[index].isCompleted = true;
  setTodos(newTodos);
};

By using completeTodo in the Todo function, we are going to be able to fire off that functionality. When the "Complete" button is clicked, it will add the textDecoration styling and cross out the item. We are using a ternary operator, a compact way of writing an if/else statement. This is our way of completing an item on the list and "updating" the list. The code should look as follows:

src/App.js

function Todo({ todo, index, completeTodo }) {
  return (
    <div
      className="todo"
      style={{ textDecoration: todo.isCompleted ? "line-through" : "" }}
    >
      {todo.text}
      <div>
        <button onClick={() => completeTodo(index)}>Complete</button>
      </div>
    </div>
  );
}
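
If the ternary inside that style prop reads oddly, here is the same logic spelled out with if/else; the sample todo object is invented for illustration and is not code you need to add:

const todo = { text: "Learn about React", isCompleted: true };

// The ternary used in the style prop above:
const decoration = todo.isCompleted ? "line-through" : "";

// The equivalent if/else:
let decorationLong;
if (todo.isCompleted) {
  decorationLong = "line-through";
} else {
  decorationLong = "";
}

console.log(decoration);     // "line-through"
console.log(decorationLong); // "line-through"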

Dive down to the return of the App component and we’ll add in the following line:

completeTodo={completeTodo}

to look like this in the code:

src/App.js

<div className="todo-list">
  {todos.map((todo, index) => (
    <Todo
      key={index}
      index={index}
      todo={todo}
      completeTodo={completeTodo}
    />
  ))}
</div>

Returning to the browser, your to-do app should look something like this when a “Complete” button is clicked.

Now we can read our list, add to our list, and update the completed status of each item. What’s left? Delete an item.

Deleting a To-Do Item - cruD

So you’ve completed an item on your to-do list, the day is over and you want to delete it from your list to be able to start over tomorrow. We are going to want to delete that item completely. Let’s see how we can get that going.

By adding just a couple lines, we will be able to add in the functionality of deleting an item.

We will go ahead and build the removeTodo function so that when we click on an “X” to delete an item, that will be fired off. That function will be located by the others underneath the state of the App component.

In this removeTodo function, we again use the spread operator to grab the current list, and then "splice" the chosen index out of that array of items. Once the item is removed, we set the state to newTodos with setTodos.

src/App.js

const removeTodo = index => {
  const newTodos = [...todos];
  newTodos.splice(index, 1);
  setTodos(newTodos);
};
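
As a small aside on how splice() behaves (the array here is just an example, not part of the app): splice mutates the array it is called on, which is exactly why we copy the state with the spread operator first.

const items = ["a", "b", "c"];
const copy = [...items];

// Remove 1 element starting at index 1 ("b"):
copy.splice(1, 1);

console.log(copy);  // ["a", "c"]
console.log(items); // ["a", "b", "c"] - the original stays intact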

In your Todo function, you will want to add in this line:

<button onClick={() => removeTodo(index)}>x</button>

like this:

src/App.js

function Todo({ todo, index, completeTodo, removeTodo }) {
  return (
    <div
      className="todo"
      style={{ textDecoration: todo.isCompleted ? "line-through" : "" }}
    >
      {todo.text}
      <div>
        <button onClick={() => completeTodo(index)}>Complete</button>
        <button onClick={() => removeTodo(index)}>x</button>
      </div>
    </div>
  );
}

You’ll see as well that we are bringing in removeTodo at the top and then using it in the onClick of the “X”.

By adding removeTodo to the Todo element in the return of the App component, our "delete" will be fully functional. Add it in here:

src/App.js

<Todo
  key={index}
  index={index}
  todo={todo}
  completeTodo={completeTodo}
  removeTodo={removeTodo}
/>

With that added in, go to your browser and you’ll see a button with an "X" that when clicked, deletes the item completely.

The Final Product

The entire src/App.js file should look like this in the end:

src/App.js

import React, { useState } from "react";
import "./App.css";

function Todo({ todo, index, completeTodo, removeTodo }) {
  return (
    <div
      className="todo"
      style={{ textDecoration: todo.isCompleted ? "line-through" : "" }}
    >
      {todo.text}
      <div>
        <button onClick={() => completeTodo(index)}>Complete</button>
        <button onClick={() => removeTodo(index)}>x</button>
      </div>
    </div>
  );
}

function TodoForm({ addTodo }) {
  const [value, setValue] = useState("");

  const handleSubmit = e => {
    e.preventDefault();
    if (!value) return;
    addTodo(value);
    setValue("");
  };

  return (
    <form onSubmit={handleSubmit}>
      <input
        type="text"
        className="input"
        value={value}
        onChange={e => setValue(e.target.value)}
      />
    </form>
  );
}

function App() {
  const [todos, setTodos] = useState([
    { text: "Learn about React", isCompleted: false },
    { text: "Meet friend for lunch", isCompleted: false },
    { text: "Build really cool todo app", isCompleted: false }
  ]);

  const addTodo = text => {
    const newTodos = [...todos, { text }];
    setTodos(newTodos);
  };

  const completeTodo = index => {
    const newTodos = [...todos];
    newTodos[index].isCompleted = true;
    setTodos(newTodos);
  };

  const removeTodo = index => {
    const newTodos = [...todos];
    newTodos.splice(index, 1);
    setTodos(newTodos);
  };

  return (
    <div className="app">
      <div className="todo-list">
        {todos.map((todo, index) => (
          <Todo
            key={index}
            index={index}
            todo={todo}
            completeTodo={completeTodo}
            removeTodo={removeTodo}
          />
        ))}
        <TodoForm addTodo={addTodo} />
      </div>
    </div>
  );
}

export default App;

What have we learned?

A to-do app can be a great reminder or starting point when it comes to CRUD in web development. Being able to read information, create new information, update existing information, and delete said information can be powerful in any application.

React Hooks are great. They allow for a more straightforward way of coding and can make your code clear and concise.

Now go have fun adding all your many to-do items to your newly built to-do app. And then have even more fun crossing them off when you finish them!

Happy coding!

Categories: Web Technologies

How to install Percona Server 5.7 on Debian/Ubuntu without a root password prompt

Tue, 11/13/2018 - 02:24
In the last few months, I had to install Percona Server 5.7 (PS5.7) on Debian a few times.  I was regularly annoyed by apt-get -y install percona-server-server-5.7 prompting me for a password.  But that annoyance did not push me to investigate the subject in detail: it was always a quick manual fix and Googling did not lead to a straightforward solution.  However in the last days, I had to
Categories: Web Technologies

Importing Data from MongoDB to MySQL using JavaScript

Tue, 11/13/2018 - 01:00

MySQL Shell 8.0.13 (GA) introduced a new feature to import JSON documents to MySQL in an easy way. The basics of this new feature were described in a previous blog post. Here, we provide more details about this feature, focusing on a practical use case that may be of interest for some readers: How to import JSON data from MongoDB to MySQL.…

Categories: Web Technologies

Auto-Refreshing Reports in MySQL Shell

Tue, 11/13/2018 - 00:38

MySQL Shell makes it easy to develop tools you can use for example to generate reports. In a previous blog, I showed how to use external modules in MySQL Shell. In this blog, I will take it one step further and use the curses Python module to create auto-refreshing reports. The first example will be kept very simple to show the idea, then a more realistic example will be shown where the top N files sorted by I/O will be returned.

Note: Out of the box, this does not work on Windows as Python does not ship with the curses library.

Basic Example

As a basic example, consider the query SELECT NOW(). This returns the date and time. Let’s query that every second for 10 seconds, then return to the prompt. The easiest is to look at the example and then discuss what it does:

import curses
from time import sleep

def clock():
    # Setup curses
    stdscr = curses.initscr()
    curses.noecho()
    curses.cbreak()
    stdscr.keypad(True)

    # Define the query
    query = session.sql("SELECT NOW()")

    # Clear screen
    stdscr.clear()

    # Run the clock for 10 seconds
    i = 0
    while i < 10:
        i = i + 1
        result = query.execute()
        now = result.fetch_one()
        stdscr.addstr(0, 0, "Now")
        stdscr.addstr(1, 0, "-------------------")
        stdscr.addstr(2, 0, str(now[0]))
        stdscr.refresh()
        sleep(1)

    # Reset the curses behaviour and finish
    curses.nocbreak()
    stdscr.keypad(False)
    curses.echo()
    curses.endwin()

clock()

Tip: As MySQL Shell treats an empty line as the termination of a multi line block of code, ensure you have some white space on the blank lines in the definition of clock() if you are copying and pasting the code.

Inside the clock() function, curses is first set up to initialize the screen, not to echo key inputs, and to react to keys instantly (without waiting for enter to be hit). The latter is not needed here as nothing checks for key inputs; however, in many cases (like the iotop example later in the blog) you want to listen for single key inputs, for example to stop the execution. The counterparts of these steps are done at the end of the function to clean up.

Next, the query that will be executed is defined. Here I take advantage of the X DevAPI's support for defining a query once and re-using it later. This way, the query is only defined in a single spot. Then the screen is cleared and everything is ready for the loop that will do the main part of the job.

The loop in this example is very simple. It just goes through 10 iterations with a one second sleep at the end of each loop. In a real case, you may want to make this more resilient to delays or add other kinds of logic. The query is executed and the single row in the result is fetched. Finally, the addstr() curses method is used to print the output in the desired location (the upper left corner in this case).

When you execute the function, you will get a result like in the following screen shot:

Using MySQL Shell as a clock

This is all fine, but using MySQL Shell to show a clock is hardly its most interesting use. Let's look at an example that is more useful in the real world.

MySQL iotop

If you are a Linux user, you may know the iotop utility, which allows you to monitor I/O activity in a similar way to what the top command does for CPU and memory usage of processes. Let's implement a basic MySQL my_iotop module with a global_by_file_by_bytes function that displays the N (default 10) MySQL files that have had the most I/O, refreshes at a specified interval (default 10 seconds), and continues until you hit the q key.

This may sound like a complex task; however, most of the steps are the same as in the previous example. The top N files with respect to I/O can be found using the io_global_by_file_by_bytes view from the sys schema. This view sorts by the total number of bytes read and written per file in descending order, so only a simple single-view query is needed. To keep the output handling in this example from getting too complex, fixed-width columns are used and file paths longer than 50 characters are truncated.

The only things required that were not part of the previous example are limiting the number of files to include and detecting when q is pressed. The limit is easy to add when using the select() CRUD method.

Another thing that is worth doing is to include the feature in an external module. This makes it easier to reuse. This requires you to add the directory where you will save your modules to the ~/.mysqlsh/mysqlshrc.py file, for example:

import sys
sys.path.append("/home/jesper/.mysqlsh/lib")

See also my previous blog on using external modules for more information.

In this example the file with the source code is called my_iotop.py stored in the directory added to the mysqlshrc.py file.

Warning: The following code is devoid of error handling. If you intend to use this in production please validate the input and check for errors.

The complete source is:

import curses
from datetime import datetime
from time import sleep

def global_by_file_by_bytes(session, max_files=10, delay=10):
    # Define the output format
    fmt_header = "| {0:50s} | {1:12s} | {2:13s} | {3:13s} | {4:12s} " \
               + "| {5:13s} | {6:13s} | {7:13s} | {8:8s} |"
    header = fmt_header.format("File", "# Reads", "Bytes Read", "Avg. Read",
                               "# Writes", "Bytes Write", "Avg. Write",
                               "Bytes Total", "Write %")
    bar = "+" + "-" * 52 + "+" + "-" * 14 + "+" + "-" * 15 + "+" + "-" * 15 \
        + "+" + "-" * 14 + "+" + "-" * 15 + "+" + "-" * 15 + "+" + "-" * 15 \
        + "+" + "-" * 10 + "+"
    fmt_row = "| {0:50.50s} | {1:12d} | {2:13s} | {3:13s} | {4:12d} " \
            + "| {5:13s} | {6:13s} | {7:13s} | {8:8s} |"

    # Setup curses
    stdscr = curses.initscr()
    curses.start_color()
    curses.init_pair(1, curses.COLOR_RED, curses.COLOR_WHITE)
    curses.noecho()
    curses.cbreak()
    # Listen for 1/10th of a second at a time
    curses.halfdelay(1)
    stdscr.keypad(True)

    # Define the query
    sys_schema = session.get_schema("sys")
    table = sys_schema.get_table("io_global_by_file_by_bytes")
    query = table.select().limit(max_files)

    # Clear screen
    stdscr.clear()

    # Run the query and generate the report
    keep_running = True
    while keep_running:
        time = datetime.now()
        result = query.execute()

        stdscr.addstr(0, 0, time.strftime('%A %-d %B %H:%M:%S'),
                      curses.color_pair(1))
        stdscr.addstr(2, 0, bar)
        stdscr.addstr(3, 0, header)
        stdscr.addstr(4, 0, bar)

        # Print the rows in the result
        line = 5
        for row in result.fetch_all():
            stdscr.addstr(line, 0, fmt_row.format(*row))
            line = line + 1
        stdscr.addstr(line, 0, bar)
        stdscr.refresh()

        # Wait until delay seconds have passed while listening for the q key
        while (datetime.now() - time).total_seconds() < delay:
            c = stdscr.getch()
            if c == ord("q"):
                keep_running = False
                break

    # Reset the curses behaviour and finish
    curses.nocbreak()
    stdscr.keypad(False)
    curses.echo()
    curses.endwin()

The example uses a few more of the curses features, which I will not go into detail about. I recommend reading the Python documentation and the Curses Programming with Python tutorial if you are interested in learning more about using curses.

You start the report by calling the global_by_file_by_bytes() function. The session for the connection is required as an argument. Optional arguments are the delay between each iteration (delay) and the maximum number of files to include in the report (max_files). An example using a delay of 5 seconds is:

mysql-py> import my_iotop
mysql-py> my_iotop.global_by_file_by_bytes(session, delay=5)

While the implementation shown here is quite rough around the edges, it does show the potential. And remember, you have the full Python language available for manipulating the data. Click on the video below to see an example of the report.

https://mysql.wisborg.dk/wp-content/uploads/2018/11/shell_my_iotop.mp4
Categories: Web Technologies

Innotop for MySQL 8.0

Mon, 11/12/2018 - 13:52

Recently, I did some maintenance on Innotop and I merged several pull requests including one to support MySQL 8.0 (thank you yoku0825 for the contribution).

As you know, Innotop is written in Perl, and to be able to use it with MySQL 8.0 you need a Perl DBD for MySQL that is compatible with MySQL 8.0 too (a driver that supports the new default authentication plugin, caching_sha2_password).

If you are using Fedora 29, I created such a package and you can download it here: perl-DBD-MySQL-4.046-5.mysql8.fc29.x86_64

We can see the difference between the default package and this one.

Old package:

$ ldd /usr/lib64/perl5/vendor_perl/auto/DBD/mysql/mysql.so
...
libmariadb.so.3 => /lib64/libmariadb.so.3 (0x00007f08f026d000)
...
libssl.so.1.1 => /lib64/libssl.so.1.1 (0x00007f08f0033000)
libcrypto.so.1.1 => /lib64/libcrypto.so.1.1 (0x00007f08efd57000)
...

New package:

$ ldd /usr/lib64/perl5/vendor_perl/auto/DBD/mysql/mysql.so
...
libmysqlclient.so.21 => /usr/lib64/mysql/libmysqlclient.so.21 (0x00007fa3ad611000)
...
libcrypto.so.10 => /lib64/libcrypto.so.10 (0x00007fa3ad1ae000)
libdl.so.2 => /lib64/libdl.so.2 (0x00007fa3ad1a6000)
...

Now you can use Innotop with MySQL 8.0 again!


Thank you again to all contributors and don’t hesitate to submit your patches.

Categories: Web Technologies

Percona Server for MySQL 5.7.23-24 Is Now Available

Mon, 11/12/2018 - 10:52

Percona announces the release of Percona Server for MySQL 5.7.23-24 on November 12, 2018 (downloads are available here and from the Percona Software Repositories). This release merges changes of MySQL 5.7.23, including all the bug fixes in it. Percona Server for MySQL 5.7.23-24 is now the current GA release in the 5.7 series. All of Percona’s software is open-source and free.

This release introduces InnoDB encryption improvements and merges upstream MyRocks changes. Also, we've improved the usage of column families in MyRocks. The InnoDB encryption improvements are of Alpha quality and we don't recommend using them in production.

New Features

Bugs Fixed
  • #4723: PURGE CHANGED_PAGE_BITMAPS did not work when innodb_data_home_dir was used
  • #4937: rocksdb_update_cf_options was ignored when specified in my.cnf or on the command line
  • #1107: The binlog could be corrupted when tmpdir got full
  • #4834: The encrypted system tablespace could have an empty uuid

Other bugs fixed

  • #4106: “Assertion `log.getting_synced’ failed in rocksdb::DBImpl::MarkLogsSynced(uint64_t, bool, const rocksdb::Status&)“
  • #4930: “main.percona_log_slow_innodb: Result content mismatch”
  • #4811: “5.7 Merge and fixup for old DB-937 introduces possible regression”
  • #4705: “crash on snapshot size check in RocksDB”

Find the release notes for Percona Server for MySQL 5.7.23-24 in our online documentation. Report bugs in the Jira bug tracker.

Categories: Web Technologies

Vitess Weekly Digest 11/12/18

Mon, 11/12/2018 - 09:36
We continue the digest from the Slack discussions for Sep 22 2018 to Oct 5 2018. We've fallen slightly behind on this, but will catch up again soon.

Enable VtGateExecute in vtctld

Arsalan [Sep 22nd] Hi, I want to query vitess on vtgate but I get the error below. How can I fix this problem?
vitess@vtctldv3-hrl74:/$ vtctlclient -server 10.5.61.20:16999  VtGateExecute -server 10.5.61.21:16991 "show tables" E0923 05:14:55.169771    1102 main.go:61] Remote error: rpc error: code = Unknown desc = query commands are disabled (set the -enable_queries flag to enable)
sougou you need to add this flag to vtctld:  -enable_queries         if set, allows vtgate and vttablet queries. May have security implications, as the queries will be run from this process.
Multi-column Primary Key

sjmudd [Sep 26th] Question: if a mysql table has a multi-column primary key, can Vitess handle this? I vaguely remember the answer might be "no", but looking in vschema_test.go I think that may not be the case.
sougou multi-column pk should work. What we don't fully support yet are multi-column vindexes. there's partial support for multi-column lookup vindexes, where we write multiple columns, but only use the first column for computing the keyspace id.
Vitess wins InfoWorld Bossie award

sougou [Sep 26th] "The best open source software for data storage and analytics": https://www.infoworld.com/article/3306454/big-data/the-best-open-source-software-for-data-storage-and-analytics.html#slide9
Also, @acharis and @tpetr have been accepted to present vitess at kubecon: https://kccna18.sched.com/event/Gras/vitess-at-hubspot-how-we-moved-hundreds-of-mysql-databases-into-kubernetes-tom-petr-alexander-charis-hubspot
Splitting sharded and unsharded tables

Mark Solters [Sep 27th] If one were to shard foo/0 to baz/{-80,80-}, would the SplitClone foo/0 syntax have to be modified to reflect that the destination shards are no longer in the source keyspace? Inspecting the command seems to suggest that there is no "destination shards" option for SplitClone. In this case, my question is better phrased as "How can we horizontally reshard one keyspace into another?" It seems the missing part is convincing foo/0 that it "overlaps" with baz/-80,80-. I think the correct answer here is that I should be vertically sharding.
sougou Yeah. you first have to vertical shard from foo/0 to baz/0, and then reshard baz to baz/{...}
Mark Solters I see. Is there a guide or set of instructions for the vertical sharding procedure?
sougou It's one of the top items that we need to document. In the meantime, you can look at my scripts. Steps are similar to horizontal sharding: https://gist.github.com/sougou/e6259e958b5707d5888a5ac406418cc3
Mark Solters Very interesting! I’ve basically done these steps minus this CreateKeyspace --served_from 'master:test_keyspace,replica:test_keyspace,rdonly:test_keyspace' target Is that step strictly necessary?  If I have already copied over the relevant CopySchemaShard, and baz already has a functional master etc. can I simply run VerticalCloneSplit? i ask because i notice VerticalCloneSplit specifies the destination but not the source.
(The thread was forgotten at this point) Answer: The specific step is strictly necessary for SplitClone. It's currently a rigid workflow that we plan to make more flexible in the future.
Show tables of a keyspace

mgersh [Sep 27th] is there a query to show all tables for a certain keyspace? we are using the jdbc driver "show tables from $keyspace" just returns the default keyspace's tables
leoxlin If you connect to a specific Keyspace, you can use SHOW TABLES;
If pods crash, check resource limits
mshivanna [Oct 1st] hi we recently spun up vitess cluster now we want to import a mysql dump. what is the right way to do it? We execed into the master node (vttablet-100) mysql container and did a usual mysql import but the vttablet pod kept crashin after sometime. The dump is 13Gb in size. 
sougou Can you look at the reason why the pod was terminated? Most likely, you ran out of some resource.
mshivanna yeah i have increased the cpu its not failing quite often but its failing will increase again. will update here thank you @sougou
Ameet presents at Velocity conference, NY

ameet [Oct 2nd] Hi Vitessians, I just finished my talk about _Moving Slack's database to Vitess_ at Velocity Conference, NY. Here's the link to the talk and slides: https://conferences.oreilly.com/velocity/vl-ny/user/proposal/status/69885 The response was great. There was lots of interest about Vitess.
Improved control on resharding

young [Oct 5th] Is there an open issue for encapsulating resetting vreplication in one command? It's quite annoying to do manually.
sougou I'm working on a PR to perform high level commands for canceling resharding etc.
UPDATE:  these features are now done.
Understanding keyspace ids and keyranges

Mark Solters [Oct 5th] given a shard like 80- what is the smallest keyspace ID that would be mapped to this shard? is it 0x80? 0x81? 0x800? 0x8000? (i am trying to construct a numeric_static_map) i know that we use a syntax that makes the RHS zeros "optional" but I also think that the center of the keyspace can't be as small a number as 0x80. (!) is what we write as 0x80 actually base-10 9223372036854775808?
sougou the keyspace id is a bunch of bits (left justified). there is theoretically no limit to the number of bytes it can have. it may get confusing if you try to see it as a number because numbers are right justified. in some cases, we 'encode' a number as a keyspace id. If so, we set the length as 64 bits. it's just a convention. One could have used 128 bits or 32 bits, or whatever.
Mark Solters hmm, OK so how then does one construct a numeric_static_map? the vitess repo contains a numeric_static_map_test.json where it is simply mapping some example primary keys to integers like 1 and 2, which if i'm reading this correctly, are the keyspace IDs im basically confused about how this map here translates to shards: https://github.com/vitessio/vitess/blob/master/data/test/vtgate/numeric_static_map_test.json {     "1": 1,     "3": 2 }
sougou Oops. Sorry about the tangent. Yeah. In your case, you're right. the 922.. number is the lowest number for shard 80- this is because numeric_static_map converts the number into a 64-bit keyspace id.
Mark Solters yes, and i take your point about the question being ambiguous without something grounding the bit size hmmmm so, for this (trivial) example to work i guess id have to construct a shard like -000000000000002 which would have an effective key size of 1
sougou right. that shard can have 2 ids (0 & 1)
Meaning of target in the Go driver

derekperkins [Oct 5th] @sougou what is a target in the go driver? https://github.com/vitessio/vitess/blob/master/go/vt/vitessdriver/driver.go#L129
sougou best documentation is the test https://github.com/vitessio/vitess/blob/master/go/vt/vitessdriver/driver_test.go
derekperkins ah, thanks and leaving it empty presumably points to master?
sougou yeah well if you didn't override the default tablet type in vtgate
Categories: Web Technologies

Another two shows this week with MySQL!!!

Mon, 11/12/2018 - 03:42

We are happy to confirm that we are ready for another two shows taking place this week: one in the US, the second in Europe. Please find the details below:

  • PHP[World]
    • Place: Washington DC, US
    • Date: November 14-15, 2018
    • The MySQL Community team is a Workshop sponsor here, with David Stokes, the MySQL Community Manager, as the MySQL representative on site.
  • BGOUG
    • Place: Pravets, Bulgaria
    • Date: November 16, 2018
    • MySQL has sponsored & attended this show for years. This year Georgi Kodinov, the Senior Software Developer Manager for MySQL, will have a talk on "Data Masking in MySQL Enterprise 5.7 and 8". The talk is scheduled for Friday, Nov 16 @17:30-18:15 in the DB session. See the full schedule here.

We are looking forward to seeing & talking to you there!!

Categories: Web Technologies
