|
| 1 | +//// |
| 2 | +Make sure to rename this file to the name of your repository and add the filename to the README. This filename must not conflict with any existing tutorials. |
| 3 | +//// |
| 4 | + |
| 5 | +// Describe the title of your article by replacing 'Tutorial template' with the page name you want to publish. |
| 6 | += Generating Streaming Data Using SQL |
| 7 | +// Add required variables |
| 8 | +:page-layout: tutorial |
| 9 | +:page-product: cloud // Required: Define the product filter for this tutorial. Add one of the following: platform, imdg, cloud, operator |
| 10 | +:page-categories: Stream Processing, Get Started, SQL // Optional: Define the categories for this tutorial. Check the current categories on the tutorials homepage (https://docs.hazelcast.com/tutorials/). Add one or more of the existing categories or add new ones as a comma-separated list. Make sure that you use title case for all categories. |
| 11 | +:page-lang: sql // Optional: Define what Hazelcast client languages are supported by this tutorial. Leave blank or add one or more of: java, go, python, cplus, node, csharp. |
| 12 | +:page-enterprise: // Required: Define whether this tutorial requires an Enterprise license (true or blank) |
| 13 | +:page-est-time: 10 mins // Required: Define the estimated number of time required to complete the tutorial in minutes. For example, 10 mins |
| 14 | +:description: Use SQL on Hazelcast to generate randomized streaming data for demo/POC purposes. |
| 15 | +// Required: Summarize what this tutorial is about in a sentence or two. What you put here is reused as the tutorial's first paragraph and included in HTML description tags. Start the sentence with an action verb such as 'Deploy' or 'Connect'. |
| 16 | + |
| 17 | +{description} |
| 18 | + |
| 19 | +// Give some context about the use case for this tutorial. What will the reader learn? |
| 20 | +== Context |
| 21 | +In this tutorial, you will learn how to use SQL to generate streaming data inside a Hazelcast cluster. |
| 22 | + |
| 23 | +Using the `CREATE VIEW` statement and the `generate_stream` table function of SQL on Hazelcast, you can create a data stream locally within Hazelcast. |
| 24 | + |
| 25 | +As of Hazelcast 5.3, you can: |
| 26 | + |
| 27 | +* Access this data directly via SQL functionality in any programming language. |
| 28 | + |
| 29 | +* Direct the output to an external Kafka server, then use the connection to the Kafka server to access the data via the pipeline API or via SQL using `CREATE MAPPING`. |
| 30 | + |
| 31 | +[NOTE] |
| 32 | +==== |
| 33 | +In an upcoming release of Hazelcast, the Kafka Connect connector will expose SQL, eliminating the need for an external Kafka server to feed data into the pipeline API. |
| 34 | +==== |
| 35 | + |
| 36 | +// Optional: What does the reader need before starting this tutorial? Think about tools or knowledge. Delete this section if your readers can dive straight into the lesson without requiring any prerequisite knowledge. |
| 37 | +== Before You Begin |
| 38 | + |
| 39 | +Before starting this tutorial, make sure that you meet the following prerequisites: |
| 40 | + |
| 41 | +* A running Hazelcast cluster |
| 42 | +* Access to a SQL command line, through either the Hazelcast Command-Line Client (CLC) or Management Center |
| 43 | +* (Optional) A Kafka instance that your Hazelcast cluster can reach |
| 44 | + |
| 45 | + |
| 46 | +== Step 1. Generating Data |
| 47 | + |
| 48 | +//// |
| 49 | +Introduce what your audience will learn in each step, then continue to write the steps in the tutorial. |
| 50 | +You can choose one of these approaches to write your tutorial part: |
| 51 | + |
| 52 | +* In a narrative style if your parts are short or you are using screenshots to do most of the talking. |
| 53 | +* In a "Goal > Steps > Outcome" structure to build a predictable flow in all your tutorial parts. |
| 54 | + |
| 55 | +Whatever option you choose when designing your tutorial should be carried through in subsequent parts. |
| 56 | +//// |
| 57 | + |
| 58 | +The following code is from the link:https://docs.hazelcast.com/tutorials/SQL-Basics-on-Viridian[SQL Basics on Viridian (Stock Ticker Demo) tutorial]. The comments break down what each part of the code is doing. |
| 59 | + |
| 60 | +```sql |
| 61 | +CREATE OR REPLACE VIEW trades AS |
| 62 | + SELECT id, |
| 63 | + |
| 64 | + CASE WHEN tickRand BETWEEN 0 AND 0.1 THEN 'AAPL' |
| 65 | + WHEN tickRand BETWEEN 0.1 AND 0.2 THEN 'GOOGL' |
| 66 | + WHEN tickRand BETWEEN 0.2 AND 0.3 THEN 'META' |
| 67 | + WHEN tickRand BETWEEN 0.3 AND 0.4 THEN 'NFLX' |
| 68 | + WHEN tickRand BETWEEN 0.4 AND 0.5 THEN 'AMZN' |
| 69 | + WHEN tickRand BETWEEN 0.5 AND 0.6 THEN 'INTC' |
| 70 | + WHEN tickRand BETWEEN 0.6 AND 0.7 THEN 'CSCO' |
| 71 | + WHEN tickRand BETWEEN 0.7 AND 0.8 THEN 'BABA' |
| 72 | + ELSE 'VOO' |
| 73 | + END as ticker, <1> |
| 74 | + |
| 75 | + CASE WHEN tickRand BETWEEN 0 AND 0.1 THEN tickRand*50+1 |
| 76 | + WHEN tickRand BETWEEN 0.1 AND 0.2 THEN tickRand*75+.6 |
| 77 | + WHEN tickRand BETWEEN 0.2 AND 0.3 THEN tickRand*60+.2 |
| 78 | + WHEN tickRand BETWEEN 0.3 AND 0.4 THEN tickRand*30+.3 |
| 79 | + WHEN tickRand BETWEEN 0.4 AND 0.5 THEN tickRand*43+.7 |
| 80 | + WHEN tickRand BETWEEN 0.5 AND 0.6 THEN tickRand*100+.4 |
| 81 | + WHEN tickRand BETWEEN 0.6 AND 0.7 THEN tickRand*25+.8 |
| 83 | + WHEN tickRand BETWEEN 0.7 AND 0.8 THEN tickRand*10+.1 |
| 84 | + ELSE tickRand*100+4 |
| 85 | + END as price,<2> |
| 86 | + |
| 87 | + trade_ts, |
| 88 | + amount |
| 89 | +FROM |
| 90 | + (SELECT v as id, |
| 91 | + RAND(v*v) as tickRand,<3> |
| 92 | + TO_TIMESTAMP_TZ(v*10 + 1645484400000) as trade_ts, <4> |
| 93 | + ROUND(RAND()*100, 0) as amount |
| 94 | + FROM TABLE(generate_stream(100))); <5> |
| 95 | +``` |
| 96 | +<1> We're using the random number to generate different stock ticker symbols. |
| 97 | +<2> To keep each ticker's price within a reasonable range of variation, we use the same `BETWEEN` ranges, and give each one a different base multiplier. |
| 98 | +<3> The random number generator creates the `tickRand` value. |
| 99 | +<4> We seed the timestamp with a base value, in milliseconds since the Unix epoch, that equates to 21 Feb 2022. You can change this to any reasonable Unix timestamp expressed in milliseconds. |
| 100 | +<5> The `generate_stream` function is what makes this all work. In this example, we're generating 100 events per second. |
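
To see what `generate_stream` produces before any of the `CASE` logic is applied, it can help to query it on its own. The following is a minimal sketch, assuming the function exposes a single `BIGINT` column named `v` (the same column the view reads) that counts up from 0. The rate of 1 event per second is chosen only to keep the output readable.

```sql
-- Raw generator output plus the derived timestamp, one row per second.
-- v*10 spaces consecutive events 10 ms apart, on top of the base value
-- 1645484400000 (milliseconds since the Unix epoch).
SELECT v AS id,
       TO_TIMESTAMP_TZ(v*10 + 1645484400000) AS trade_ts
FROM TABLE(generate_stream(1));
```

At the view's rate of 100 events per second, the 10 ms spacing means the generated timestamps advance at roughly wall-clock speed. As with any streaming query, this one runs until you cancel it.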
| 101 | + |
| 102 | +Once this view is created, you can access it via SQL. Because you're looking at streaming data, you'll need to use CTRL-C to stop each query. |
| 103 | + |
| 104 | +```sql |
| 105 | +SELECT * FROM trades; |
| 106 | + |
| 107 | +SELECT ticker AS Symbol, ROUND(price,2) AS Price, amount AS "Shares Sold" |
| 108 | +FROM trades; |
| 109 | +``` |
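
Because the view behaves like any other streaming source, ordinary filtering should apply to it as well. A small sketch, using only columns defined in the `trades` view; the ticker and the threshold of 50 are arbitrary illustrative values:

```sql
-- Show only the larger META trades; press CTRL-C to stop the query.
SELECT ticker, ROUND(price, 2) AS price, amount
FROM trades
WHERE ticker = 'META'
  AND amount >= 50;
```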
| 110 | + |
| 111 | + |
| 112 | + |
| 113 | +== Step 2. Inserting into Kafka |
| 114 | + |
| 115 | +//// |
| 116 | +Continue the design approach you chose in the previous part and continue it through to the end of the tutorial. |
| 117 | +//// |
| 118 | + |
| 119 | +You can send generated data to Kafka, which stores and replays it as it would data from any other streaming source. Instead of creating a view local to Hazelcast, you create a SQL mapping for the Kafka topic, then use an `INSERT INTO` statement to send the generated data to that topic. |
| 120 | + |
| 121 | +. First, create a mapping for the data. Include all the fields that you'll generate with SQL. This creates a topic in Kafka as well as m |
| 122 | ++ |
| 123 | +```sql |
| 124 | +CREATE OR REPLACE MAPPING trades ( |
| 125 | + id BIGINT, |
| 126 | + ticker VARCHAR, |
| 127 | + price DECIMAL, |
| 128 | + trade_ts TIMESTAMP WITH TIME ZONE, |
| 129 | + amount BIGINT) |
| 130 | +TYPE Kafka |
| 131 | +OPTIONS ( |
| 132 | + 'valueFormat' = 'json-flat', |
| 133 | + 'bootstrap.servers' = 'broker:9092' |
| 134 | +); |
| 135 | +``` |
| 136 | +. Next, use an `INSERT INTO` statement to send the data to the `trades` mapping you just created. |
| 137 | +```sql |
| 138 | +INSERT INTO trades |
| 139 | + SELECT id, |
| 140 | + CASE WHEN tickRand BETWEEN 0 AND 0.1 THEN 'AAPL' |
| 141 | + WHEN tickRand BETWEEN 0.1 AND 0.2 THEN 'GOOGL' |
| 142 | + WHEN tickRand BETWEEN 0.2 AND 0.3 THEN 'META' |
| 143 | + WHEN tickRand BETWEEN 0.3 AND 0.4 THEN 'NFLX' |
| 144 | + WHEN tickRand BETWEEN 0.4 AND 0.5 THEN 'AMZN' |
| 145 | + WHEN tickRand BETWEEN 0.5 AND 0.6 THEN 'INTC' |
| 146 | + WHEN tickRand BETWEEN 0.6 AND 0.7 THEN 'CSCO' |
| 147 | + WHEN tickRand BETWEEN 0.7 AND 0.8 THEN 'BABA' |
| 148 | + ELSE 'VOO' |
| 149 | + END as ticker, |
| 150 | + CASE WHEN tickRand BETWEEN 0 AND 0.1 THEN tickRand*50+1 |
| 151 | + WHEN tickRand BETWEEN 0.1 AND 0.2 THEN tickRand*75+.6 |
| 152 | + WHEN tickRand BETWEEN 0.2 AND 0.3 THEN tickRand*60+.2 |
| 153 | + WHEN tickRand BETWEEN 0.3 AND 0.4 THEN tickRand*30+.3 |
| 154 | + WHEN tickRand BETWEEN 0.4 AND 0.5 THEN tickRand*43+.7 |
| 155 | + WHEN tickRand BETWEEN 0.5 AND 0.6 THEN tickRand*100+.4 |
| 156 | + WHEN tickRand BETWEEN 0.6 AND 0.7 THEN tickRand*25+.8 |
| 158 | + WHEN tickRand BETWEEN 0.7 AND 0.8 THEN tickRand*10+.1 |
| 159 | + ELSE tickRand*100+4 |
| 160 | + END as price, |
| 161 | + trade_ts, |
| 162 | + amount |
| 163 | +FROM |
| 164 | + (SELECT v as id, |
| 165 | + RAND(v*v) as tickRand, |
| 166 | + TO_TIMESTAMP_TZ(v*10 + 1645484400000) as trade_ts, |
| 167 | + ROUND(RAND()*100, 0) as amount |
| 168 | + FROM TABLE(generate_stream(100))); |
| 169 | +``` |
| 170 | +The code to generate the data is exactly the same; the only difference is that we're sending it to Kafka instead of creating a local view. |
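
An `INSERT INTO ... SELECT` over `generate_stream` never finishes on its own, so it is likely to keep your SQL session busy until you cancel it. One option, sketched below, is to wrap the statement in `CREATE JOB` so that it runs in the background. The job name `trades_generator` is hypothetical, and the `SELECT` is deliberately shortened to a single ticker with explicit casts to the mapping's column types; substitute the full statement from the previous step for real use.

```sql
-- Run the generator as a named background job (the name is an assumption).
CREATE JOB IF NOT EXISTS trades_generator AS
INSERT INTO trades
SELECT v AS id,
       'META' AS ticker,
       CAST(RAND(v*v)*60 + 0.2 AS DECIMAL) AS price,
       TO_TIMESTAMP_TZ(v*10 + 1645484400000) AS trade_ts,
       CAST(ROUND(RAND()*100, 0) AS BIGINT) AS amount
FROM TABLE(generate_stream(100));

-- Read the events back from the Kafka topic through the same mapping.
SELECT ticker, price, trade_ts, amount FROM trades;

-- Stop the generator when you no longer need it.
DROP JOB IF EXISTS trades_generator;
```

Running the generator as a job keeps the session free for the read-back query, and `DROP JOB` gives you an explicit way to stop producing events.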
| 171 | + |
| 172 | +== Summary |
| 173 | + |
| 174 | +//// |
| 175 | +Summarise what knowledge the reader has gained by completing the tutorial, including a summary of each step's goals (this is a good way to validate whether your tutorial has covered all you need it to.) |
| 176 | +//// |
| 177 | +You can now use SQL on Hazelcast to generate streaming data for testing/demo purposes. |
| 178 | + |