Commit a19d1db

committed
Updated README, Article_Gather files
Updated notebook markdown description.
1 parent c751f62 commit a19d1db

7 files changed

Lines changed: 72 additions & 88 deletions

article.md renamed to Article_Gather.md

Lines changed: 70 additions & 71 deletions
Original file line numberDiff line numberDiff line change
@@ -37,11 +37,11 @@ Before proceeding, I would like to briefly recap the synchronous and asynchronou
3737

3838
**Synchronous** code runs tasks one at a time — each request must complete before the next one starts. The program blocks and waits at every I/O-bound call, so if a request takes 60 seconds, nothing else runs for those 60 seconds. Fine for a single request, but a real bottleneck when fetching data with many calls.
3939

40-
![synchronous](images/02_synchronous_simple.png)
40+
![synchronous](images/01_synchronous_simple.png)
4141

4242
**Asynchronous** code lets multiple tasks run concurrently. While one request is waiting for a network response, the event loop hands control to the next task instead of sitting idle.
4343

44-
![asynchronous](images/04_asynchronous_simple.png)
44+
![asynchronous](images/02_asynchronous_simple.png)
4545

4646
The real payoff comes when you have **many requests to make**. With `asyncio.gather()` and `asyncio.TaskGroup()`, all requests are fired concurrently so the total time is roughly that of the single slowest response — not the sum of all response times.
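The timing claim above can be checked with a minimal sketch. It assumes nothing about the Data Library: `fake_request` and the `DELAYS` values are hypothetical stand-ins, and `asyncio.sleep` only simulates the network wait.

```python
import asyncio
import time


async def fake_request(name: str, delay: float) -> str:
    """Hypothetical stand-in for an I/O-bound call; sleep simulates network wait."""
    await asyncio.sleep(delay)
    return name


async def run_sequentially(delays: dict[str, float]) -> float:
    """Await each request one after another and return the elapsed seconds."""
    start = time.perf_counter()
    for name, delay in delays.items():
        await fake_request(name, delay)
    return time.perf_counter() - start


async def run_concurrently(delays: dict[str, float]) -> float:
    """Schedule all requests at once with gather() and return the elapsed seconds."""
    start = time.perf_counter()
    await asyncio.gather(*(fake_request(n, d) for n, d in delays.items()))
    return time.perf_counter() - start


# Hypothetical delays in seconds, chosen only for illustration.
DELAYS = {"A": 0.1, "B": 0.2, "C": 0.3}

sequential_time = asyncio.run(run_sequentially(DELAYS))
concurrent_time = asyncio.run(run_concurrently(DELAYS))
print(f"sequential: {sequential_time:.2f}s, concurrent: {concurrent_time:.2f}s")
```

The sequential run should take about the sum of the delays, while the concurrent run should take about as long as the slowest one. In a Jupyter notebook, where an event loop is already running, replace `asyncio.run(...)` with a direct `await ...`.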
4747

@@ -65,7 +65,7 @@ Please find more detail regarding the Data Platform HTTP error status messages f
6565

6666
The Historical Pricing endpoint rate limit information is available on the **Reference** tab of the [Data Platform API Playground](https://apidocs.refinitiv.com/Apps/ApiDocs) page. The current rate limits (**as of Mar 2026**) are as follows:
6767

68-
![historical rate limit](images/05_historical-pricing-ratelimits.png)
68+
![historical rate limit](images/03_historical-pricing-ratelimits.png)
6969

7070
## Prerequisite
7171

@@ -88,13 +88,29 @@ Make sure you have the following set up:
8888

8989
Please contact your LSEG representative or account manager for Data Platform access.
9090

91+
That’s all I have to say about the prerequisites for this article and the example code.
92+
93+
## Access Layer get_history vs Content Layer historical_pricing
94+
95+
That brings us to a big question: why use the Content Layer Historical Pricing rather than the `get_history` method?
96+
97+
The `get_history` method is part of the Library *Access Layer*. It is simple and convenient, but synchronous. Calls block execution until complete.
98+
99+
The `historical_pricing` module is part of the *Content Layer*. The Content Layer lets developers access the same content as the Access Layer, but in a more flexible manner:
100+
101+
- Richer and fuller responses where available.
102+
- Asynchronous and event-driven modes in addition to synchronous usage.
103+
- Logical content modules for market data domains such as Level 1 Market Price Data (snapshot/streaming), News, Historical Pricing and so on.
104+
105+
The module lets developers define a historical data query via a *definition* object, then get the data via the synchronous `get_data` or asynchronous `get_data_async` method. I am focusing on the asynchronous `get_data_async` method of the Historical Pricing module here.
106+
91107
## Code Walkthrough
92108

93109
Now we come to the code walkthrough. This article focuses primarily on the asynchronous code.
94110

95111
The first step is to import the required libraries. The main libraries are `lseg.data` and `asyncio`.
96112

97-
## Import Required Libraries
113+
### Import Required Libraries
98114

99115
```python
100116
import os
@@ -110,26 +126,12 @@ import pandas as pd
110126
pd.set_option("future.no_silent_downcasting", True)
111127
```
112128

113-
## Load Credentials from .env
114-
115-
Use [python-dotenv](https://pypi.org/project/python-dotenv/) to load credentials from .env.
116-
117-
Note: The .env file should not be committed to version control.
118-
119-
```python
120-
# Load environment variables from .env file
121-
load_dotenv(dotenv_path='.env')
122-
# Retrieve Platform Session credentials from environment variables
123-
app_key = os.getenv('LSEG_API_KEY')
124-
username = os.getenv('LSEG_MACHINE_ID')
125-
password = os.getenv('LSEG_PASSWORD')
126-
```
127129

128-
## Open a Platform Session
130+
### Open a Platform Session
129131

130-
Moving on to the next step,
132+
Moving on to the next step, create a Data Library session object to authenticate, manage the connection, and retrieve data.
131133

132-
Create a Data Library session object to authenticate, manage the connection, and retrieve data.
134+
The code below gets the Data Platform credentials from OS environment variables. You can also use the [python-dotenv](https://pypi.org/project/python-dotenv/) library to load the credentials from a `.env` file.
133135

134136
```python
135137

@@ -152,9 +154,14 @@ session.set_default(ld_session)
152154

153155
# Open the connection to the LSEG Data Platform
154156
ld_session.open()
157+
#
155158
```
156159

157-
## Declare Instruments and Request Parameters
160+
If the library can open the session successfully, you should see the **<OpenState.Opened: 'Opened'>** output message.
161+
162+
The next step is to create the data request variables, such as a dictionary of company RICs and names, the request fields, and so on.
163+
164+
### Declare Instruments and Request Parameters
158165

159166
```python
160167
# -- Instrument universe --------------------------------------------------------
@@ -164,28 +171,7 @@ INSTRUMENTS = {
164171
"MSFT.O": "Microsoft",
165172
"AMZN.O": "Amazon",
166173
"GOOG.O": "Alphabet",
167-
"AVGO.O": "Broadcom",
168-
"META.O": "Meta",
169-
"ORCL.N": "Oracle",
170-
"IBM.N": "IBM",
171-
"PLTR.O": "Palantir",
172-
"NFLX.O": "Netflix",
173-
"TSLA.O": "Tesla",
174-
"CRM.N": "Salesforce",
175-
"AMD.O": "AMD",
176-
"INTC.O": "Intel",
177-
"ARM.O": "Arm Holdings",
178-
"TXN.O": "Texas Instruments",
179-
"CSCO.O": "Cisco Systems",
180-
"WMT.O": "Walmart",
181-
"LLY.N": "Eli Lilly and Company",
182-
"JPM.N": "JPMorgan Chase & Co.",
183-
"XOM.N": "Exxon Mobil Corporation",
184-
"V.N": "Visa Inc.",
185-
"JNJ.N": "Johnson & Johnson",
186-
"MU.O": "Micron Technology, Inc.",
187-
"MA.N": "Mastercard Incorporated",
188-
"COST.O": "Costco Wholesale Corporation",
174+
# ....
189175
"CVX.N": "Chevron Corporation",
190176
"BAC.N": "Bank of America Corporation",
191177
"CAT.N": "Caterpillar Inc.",
@@ -207,37 +193,27 @@ INTRADAY_FIELDS = ["TRDPRC_1", "BID", "ASK"]
207193
INTERDAY_FIELDS = ["BID", "ASK", "OPEN_PRC", "HIGH_1", "LOW_1", "TRDPRC_1", "NUM_MOVES", "TRNOVR_UNS"]
208194
```
209195

210-
## Access Layer get_history vs Content Layer historical_pricing
211-
212-
Why use Content Layer Historical Pricing rather than get_history?
213-
214-
The get_history method is part of the Access Layer. It is simple and convenient, but synchronous. Calls block execution until complete.
196+
### Using asyncio.gather
215197

216-
The historical_pricing module is part of the Content Layer and offers:
198+
That brings us to the most direct and easiest way to request historical data concurrently: combining Historical Pricing `get_data_async` calls with the [`asyncio.gather(*aws)`](https://docs.python.org/3/library/asyncio-task.html#asyncio.gather) function.
217199

218-
- Richer and fuller responses where available.
219-
- Asynchronous and event-driven modes in addition to synchronous usage.
220-
- Logical content modules for market data domains.
221-
222-
Historical Pricing uses definition objects and supports both get_data (sync) and get_data_async (async).
200+
**await asyncio.gather(*aws, return_exceptions=False)**
223201

224-
## Using asyncio.gather
202+
- Runs [awaitable objects](https://docs.python.org/3/library/asyncio-task.html#asyncio-awaitables) in the `aws` sequence concurrently.
203+
- If all awaitables succeed, it returns a Python list of results in the same order as `aws`.
204+
- `return_exceptions` controls how exceptions are handled:
205+
- If `False` (default): the first exception is raised immediately to the caller waiting on `gather()`. Other awaitables are not automatically cancelled and may continue running.
206+
- If `True`: exceptions are returned in the result list (instead of being raised immediately), alongside successful results.
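The two `return_exceptions` modes can be compared with a small sketch. The `ok` and `fail` coroutines are hypothetical stand-ins for successful and failing requests; no Data Library call is involved.

```python
import asyncio


async def ok(value: int) -> int:
    """Hypothetical request that succeeds."""
    await asyncio.sleep(0.01)
    return value


async def fail() -> int:
    """Hypothetical request that fails."""
    await asyncio.sleep(0.01)
    raise ValueError("simulated failure")


async def collect_all() -> list:
    # return_exceptions=True: the exception object sits in the result list
    # at the position of the awaitable that raised it.
    return await asyncio.gather(ok(1), fail(), ok(3), return_exceptions=True)


async def raise_first() -> str:
    # Default mode: the first exception propagates out of gather().
    try:
        await asyncio.gather(ok(1), fail(), ok(3))
    except ValueError as error:
        return f"raised: {error}"
    return "no error"


results = asyncio.run(collect_all())
outcome = asyncio.run(raise_first())
print(results)
print(outcome)
```

With `return_exceptions=True`, the caller sees every outcome in input order and can tell failures apart with `isinstance(item, Exception)`.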
225207

226-
A direct way to request data concurrently is combining Historical Pricing get_data_async calls with [asyncio.gather(*aws)](https://docs.python.org/3/library/asyncio-task.html#asyncio.gather).
208+
In default mode (`return_exceptions=False`), your code may stop at the first error and not automatically collect outcomes from the other still-running awaitables. This can leave unfinished or uncollected task outcomes that are easy to miss. To handle this pattern safely, an application must keep task references and explicitly inspect each task's status and result when needed.
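One possible way to keep task references in default mode is sketched below. The `worker` coroutine, the task names, and the delays are hypothetical; this is an illustration of the pattern, not the approach used in the article's notebook.

```python
import asyncio


async def worker(name: str, delay: float, should_fail: bool = False) -> str:
    """Hypothetical request that either returns a label or raises."""
    await asyncio.sleep(delay)
    if should_fail:
        raise RuntimeError(f"{name} failed")
    return f"{name} done"


async def main() -> dict[str, str]:
    # Create the tasks ourselves so we still hold references after gather() raises.
    tasks = {
        "fast-fail": asyncio.create_task(worker("fast-fail", 0.01, should_fail=True)),
        "slow-ok": asyncio.create_task(worker("slow-ok", 0.05)),
    }
    try:
        await asyncio.gather(*tasks.values())
    except RuntimeError:
        # gather() raised on the first failure while the other task was still running.
        # asyncio.wait() lets the remaining tasks finish without raising.
        await asyncio.wait(list(tasks.values()))

    # Inspect every task explicitly: exception() returns None for a successful task.
    report = {}
    for name, task in tasks.items():
        error = task.exception()
        report[name] = f"error: {error}" if error else task.result()
    return report


report = asyncio.run(main())
print(report)
```

The extra bookkeeping is the cost of default mode: without the saved references, the result of the slower task would never be collected.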
227209

228-
await asyncio.gather(*aws, return_exceptions=False)
210+
That is why many applications use `asyncio.gather(..., return_exceptions=True)` when they need complete visibility of both success and failure results in one place.
229211

230-
- Runs awaitables concurrently.
231-
- Returns results in the same order as inputs when all succeed.
232-
- return_exceptions controls failure behavior:
233-
- False (default): first exception is raised immediately.
234-
- True: exceptions are returned in the results list.
212+
In this example, I use `historical_pricing.events.Definition`, which returns Historical Pricing Events data similar to the Data Platform `/data/historical-pricing/v1/views/events/` endpoint.
235213

236-
For full visibility of both successes and failures, return_exceptions=True is often preferred for batch workloads.
214+
The first step is to define a `display_response` method to display returned historical data as a DataFrame.
237215

238-
In this example, the code uses historical_pricing.events.Definition.
239-
240-
## Helper: Display Responses Safely
216+
### Helper: Display Responses Safely
241217

242218
```python
243219
def display_response(data):
@@ -269,9 +245,26 @@ def display_response(data):
269245
print(f"Request failed - HTTP status: {api_response.http_status}")
270246
```
271247

272-
Compared to simpler examples that only check response success, this helper also handles Python exceptions returned by gather(..., return_exceptions=True).
273248

274-
## Request Data with gather: Events Example
249+
You may notice that the `display_response` method above is more defensive than the one used in [EX-2.01.02-HistoricalPricing-ParallelRequests.ipynb](https://github.com/LSEG-API-Samples/Example.DataLibrary.Python/blob/lseg-data-examples/Examples/2-Content/2.01-HistoricalPricing/EX-2.01.02-HistoricalPricing-ParallelRequests.ipynb), which only checks whether each response is successful.
250+
251+
```python
252+
def display_reponse(response):
253+
print(response)
254+
print("\nReponse received for", response.closure)
255+
if response.is_success:
256+
display(response.data.df)
257+
else:
258+
print(response.http_status)
259+
```
260+
261+
This `display_response` handles Python exceptions that can appear in the returned list when using `asyncio.gather(..., return_exceptions=True)`, in addition to HTTP-level failures. This makes concurrent request handling easier to debug and safer in real applications.
262+
263+
### Requesting Data
264+
265+
Next, we group multiple calls to the `get_data_async` method with `asyncio.gather()` and run them as awaitable coroutines.
266+
267+
I am demonstrating with the `historical_pricing.events.Definition` object.
275268

276269
```python
277270
# Convert dictionary keys to a list of RIC symbols (kept for quick inspection/debugging).
@@ -301,12 +294,16 @@ except* LDError as errors:
301294
print(error)
302295
```
303296

304-
When sending multiple single-RIC requests, each RIC returns one corresponding response object, and gather returns a list.
297+
![event definition dataframe results](/images/04_dataframe_1.png)
298+
299+
When you send multiple Historical Pricing definitions, each with **a single RIC**, every RIC gets its own data response. The responses are returned together, in request order, in a Python *list* from the `await tasks` statement.
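Because `asyncio.gather()` returns results in the order of its inputs rather than in completion order, each result can be paired with its RIC by position. The sketch below assumes a hypothetical `fake_history` coroutine and made-up delays in place of real Historical Pricing requests.

```python
import asyncio

# Hypothetical delays in seconds: the first RIC deliberately finishes last.
SIMULATED_DELAYS = {"MSFT.O": 0.03, "AMZN.O": 0.01, "GOOG.O": 0.02}


async def fake_history(ric: str) -> str:
    """Hypothetical stand-in for a single-RIC historical data request."""
    await asyncio.sleep(SIMULATED_DELAYS[ric])
    return f"data for {ric}"


async def main() -> dict[str, str]:
    rics = list(SIMULATED_DELAYS)
    results = await asyncio.gather(*(fake_history(ric) for ric in rics))
    # results[i] belongs to rics[i], regardless of which request finished first.
    return dict(zip(rics, results))


by_ric = asyncio.run(main())
print(by_ric)
```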
305300

306301
```python
307302
print(f" Data type is {type(historical_data)} and length is {len(historical_data)}")
308303
```
309304

305+
[image here]
306+
310307
You can extract a specific company response by closure label.
311308

312309
```python
@@ -317,7 +314,9 @@ next(
317314
)
318315
```
319316

320-
## Request Data with gather: Summaries Example
317+
[image here]
318+
319+
### Request Data with gather: Summaries Example
321320

322321
```python
323322
try:

README.md

Lines changed: 0 additions & 15 deletions
Original file line numberDiff line numberDiff line change
@@ -14,21 +14,6 @@ This project is a semi-sequel to my [Concurrent Data Platform API Calls with Pyt
1414

1515
**Note**: This project is based on the Data Library for Python version 2.1.1. The library behavior might change in future releases.
1616

17-
---
18-
19-
## (Recap) What are Synchronous and Asynchronous Execution Models?
20-
21-
**Synchronous** code runs tasks one at a time — each request must complete before the next one starts. The program blocks and waits at every I/O-bound call, so if a request takes 60 seconds, nothing else runs for those 60 seconds. Fine for a single request, but a real bottleneck when fetching data with many calls.
22-
23-
![synchronous](images/02_synchronous_simple.png)
24-
25-
**Asynchronous** code lets multiple tasks run concurrently. While one request is waiting for a network response, the event loop hands control to the next task instead of sitting idle.
26-
27-
![asynchronous](images/04_asynchronous_simple.png)
28-
29-
The real payoff comes when you have **many requests to make**. With `asyncio.gather()` and `asyncio.TaskGroup()`, all requests are fired concurrently so the total time is roughly that of the single slowest response — not the sum of all response times.
30-
31-
---
3217

3318
## Throttling and Rate Limits
3419

File renamed without changes.

images/04_dataframe_1.png

108 KB

notebook/ld_notebook_async_gather.ipynb

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -64,7 +64,7 @@
6464
"id": "4b995977",
6565
"metadata": {},
6666
"source": [
67-
"Next, use the [python-dotenv](https://pypi.org/project/python-dotenv/) library to get the Data Platform credential from the `.env` file.\n",
67+
"Next, use the [python-dotenv](https://pypi.org/project/python-dotenv/) library to get the Data Platform credentials from the `.env` file. The `os.getenv()` method reads OS environment variables as well, if you prefer.\n",
6868
"\n",
6969
"**Note**: The `.env` file **should not be committed** to the version control repository."
7070
]
@@ -1800,7 +1800,7 @@
18001800
],
18011801
"metadata": {
18021802
"kernelspec": {
1803-
"display_name": ".venv",
1803+
"display_name": "Python 3 (ipykernel)",
18041804
"language": "python",
18051805
"name": "python3"
18061806
},
