Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SAI Proposal TAM stream telemetry #2089

Merged
merged 2 commits into from
Dec 10, 2024

Conversation

Pterosaur
Copy link
Contributor

Proposal a new streaming method for telemetry

@Pterosaur Pterosaur force-pushed the ipfix_stream_telemetry branch 7 times, most recently from 4098992 to e62f676 Compare October 12, 2024 02:41
@Pterosaur Pterosaur force-pushed the ipfix_stream_telemetry branch from 515bbf2 to b033dfa Compare October 22, 2024 12:12
inc/saitam.h Outdated
* @brief Tam telemetry reporting type
*
* @type sai_tam_reporting_type_t
* @flags CREATE_AND_SET
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the CREATE_AND_SET attributes should all be CREATE_ONLY

inc/saitam.h Outdated
* @type sai_u8_list_t
* @flags READ_ONLY
*/
SAI_TAM_REPORT_ATTR_IPFIX_TEMPLATE,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for sessions that exceeds the max length of template or data, we need to split the template into multiple templates:

  • Modify the template type to sai_u8_list_list_t
  • Rename SAI_TAM_REPORT_ATTR_IPFIX_TEMPLATE to SAI_TAM_REPORT_ATTR_IPFIX_TEMPLATE_LIST
  • Rename SAI_TAM_REPORT_ATTR_REPORT_IPFIX_TEMPLATE_ID to SAI_TAM_REPORT_ATTR_REPORT_IPFIX_BASE_TEMPLATE_ID

Each template data has the template id in it, so we don't need another attribute or field to specify the template id list.

inc/saitam.h Outdated
* @flags CREATE_AND_SET
* @default 0
*/
SAI_TAM_TELEMETRY_ATTR_CACHE_SIZE,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since the ring buffer is created on driver load, we need to move it to sai switch attribute.

inc/saitam.h Outdated
/**
* @brief Report type is count based
*/
SAI_TAM_REPORTING_TYPE_COUNT_BASED,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rename to SIZE_BASED to be more explicit.

inc/saitam.h Outdated
* @default 1
* @validonly SAI_TAM_TELEMETRY_ATTR_TAM_REPORTING_TYPE == SAI_TAM_REPORTING_TYPE_COUNT_BASED
*/
SAI_TAM_TELEMETRY_ATTR_REPORTING_CHUNK_SIZE,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CHUNK_BYTES

sai_attr_list[2].value.oid = SAI_PORT_STAT_IF_IN_OCTETS;

sai_attr_list[3].id = SAI_TAM_COUNTER_SUBSCRIPTION_ATTR_LABEL;
sai_attr_list[3].value.oid = index; // Element ID of the object in the IPFIX template
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we need to add explicit capability check here instead of leveraging the return value.

for capability query, we need to add a new API in saiobject.h:

sai_status_t sai_query_stats_hft_capability(
        _In_ sai_object_id_t switch_id,
        _In_ sai_object_type_t object_type,
        _Inout_ sai_stat_hft_capability_list_t *stats_capability);

Then in saitypes.h, add the sai_stat_hft_capability_list_t following the sai_stat_capability_list_t, and we need to add the minimum polling interval in it.

sai_attr_list[3].id = SAI_TAM_COUNTER_SUBSCRIPTION_ATTR_LABEL;
sai_attr_list[3].value.oid = index; // Element ID of the object in the IPFIX template

attr_count = 4;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

to support the get-and-clear operation on queue watermark counters, we need another attribute to specify the mode.

sai_attr_list[2].id = SAI_TAM_COUNTER_SUBSCRIPTION_ATTR_STAT_ID;
sai_attr_list[2].value.oid = SAI_PORT_STAT_IF_IN_OCTETS;

sai_attr_list[3].id = SAI_TAM_COUNTER_SUBSCRIPTION_ATTR_LABEL;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Update label comment on sai.h

- The number of stats types for a single SAI object type should not exceed 32,768.
- The number of extension stats types for a single SAI object type should not exceed 32,768.
- The number of SAI objects of the same type should not exceed 32,768.
- The vendor SDK should support publishing stats in IPFIX format and its IPFIX template.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IPFIX template is created using SAI APIs. Based on that NOS can derive the template and there is no need for SAI or SDK to publish IPFIX template

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because the IPFIX recording is published by the vendor, So, according to the self-consistency principle, the IPFIX template will also be published from the vendor side as well. For example, we don't limit the order of data in the IPFIX , it's acceptable once the order of date is consistent in the IPFIX template or recording.
We expect the generation of all IPFIX template or reocrding is maintained by one side.

- The number of extension stats types for a single SAI object type should not exceed 32,768.
- The number of SAI objects of the same type should not exceed 32,768.
- The vendor SDK should support publishing stats in IPFIX format and its IPFIX template.
- If a polling frequency for stats cannot be supported, the vendor's SDK should report this error.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Supported polling frequency should be part of the SAI capability query. Based on this if NOS configures a unsupported poll frequency SAI will return error.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, agree. Will provide a new API to query this capability

*
* @return #SAI_STATUS_SUCCESS on success, #SAI_STATUS_BUFFER_OVERFLOW if lists size insufficient, failure status code on error
*/
sai_status_t sai_query_stats_st_capability(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe we should make this more generic ? since very similar function already exists and this is adding only 1 field to structure

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hi Kamil, we discussed internally using the sai_query_stats_capability directly and add the polling interval in it, but the reason we are not using that API is because of 2 reasons:

  1. Not all counters support using this way to retrieve, so if the counters are not returned in the list, it means it is not supported.
  2. Adding another field in the current stats capability breaks the ABI and making this frequently used function not backward compatible.

So after discussing it, we ended up deciding creating a new API for it.

@Pterosaur Pterosaur force-pushed the ipfix_stream_telemetry branch 17 times, most recently from db5391e to e77e917 Compare October 28, 2024 12:00
@Pterosaur Pterosaur force-pushed the ipfix_stream_telemetry branch from 39bae33 to 4c818a9 Compare December 3, 2024 07:33
inc/saiswitch.h Show resolved Hide resolved
inc/saiswitch.h Outdated
* If the collector isn't ready to receive reports, this value indicates how many
* bytes of reports that can be cached.
* 0 means no cache which is the default behavior.
* If the cache is full, new incoming data will be dropped.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For clarity. This is not a cache but just a ring buffer of chunk size of max 64k.
TAM_BUFFER_SIZE = n*REPORT_CHUNK_SIZE, where n is the number of chunks

SAI_TAM_TEL_TYPE_STATE_CONFIG_READY,

} sai_tam_tel_type_state_t;

Copy link
Contributor

@JaiOCP JaiOCP Dec 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nitpik, names of the attributes are not clear about the intent of the state machine.
I belive what you are trying to do is as follows
Default State: READY
Once all the subscription is completed, orchagent want to indiate to SAI that it can now start the creation of templates
Next State: CREATE
SAI adapter will do out of sync notification of template creation complete and once orchagent reads the templates then must transition to state to that hw should start streaming now
Next State: START
If there is any change in NOS or streaming need to be stopped then HW wil be instructed to stop streaming.
Next State: STOP
Once SAI adapter cleans its own data stucture etc, NOS will transition the state to ready
Next State: READY

FSM will look like this
READY -> CREATE - > [async notifcation] -> START -> STOP -> READY
There are no intermediate state transitions.

inc/saitam.h Outdated
* @type sai_tam_tel_type_state_t
* @flags READ_ONLY
*/
SAI_TAM_TEL_TYPE_ATTR_CURRENT_STATE,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure why you need this attribute

@Pterosaur Pterosaur force-pushed the ipfix_stream_telemetry branch 3 times, most recently from d5b47dd to 246f3a7 Compare December 5, 2024 15:56
inc/saiswitch.h Outdated
* @flags CREATE_AND_SET
* @default 65535
*/
SAI_SWITCH_ATTR_TAM_REPORT_CHUNK_SIZE,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please move this to the TAM_REPORT (or the TAM) object being used. It doesn't make sense as a switch attribute and moving to ta/tam_report also allows different chunk sizes per-report/per-tam if allowed by implementation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, It was the previous design. But we discussed with some vendors, they said they can only support global ring buffer configuration and cannot be changed at runtime.

Copy link
Contributor

@bandaru-viswanath bandaru-viswanath Dec 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK. The attribute atleast needs to be renamed to include "ST" or equivalent token since TAM is too generic and there are many other TAM features, and TAM_REPORT_CHUNK_SIZE is too generic.

inc/saiswitch.h Outdated
* @flags CREATE_ONLY
* @default 1
*/
SAI_SWITCH_ATTR_TAM_CHUNK_COUNT,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this be moved to TAM object as opposed to keeping in SWITCH object ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same reason as above.

Copy link
Contributor

@bandaru-viswanath bandaru-viswanath Dec 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK. The attribute atleast needs to be renamed to include ST or equivalent token since TAM is too generic and there are many other TAM features

* @flags CREATE_AND_SET
* @default NULL
*/
SAI_SWITCH_ATTR_TAM_TEL_TYPE_CONFIG_CHANGE_NOTIFY,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this be moved to TAM object ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like all notify handlers have to be bound on the switch object. It's the limitation of SAI design.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know if its SAI design limitation, or just that all callback handlers are at switch level currently. @kcudnik please comment.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi Bandaru, yes, in my understanding, this is basically how the notify handler works. The handler will be used by all objects in that type, hence it is added on upper layers, and the best place is the switch, as it will be setup during the initialization.

For example, fob handler will be used by all fdb entries, hence it is not tied to any fdb entry.

This is why the notifications are added on the switch.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know if its SAI design limitation, or just that all callback handlers are at switch level currently. @kcudnik please comment.

Currently it's forced that all notifications be in switch, this will unify entire design, and it will help update pointers when switch is put into warm boot mode, currently we don't have specific notifications per object and we are not planning to do so, more explanation is like Riff said in above comment

Copy link
Collaborator

@kcudnik kcudnik Dec 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can return object_id of that tam in notification, and act accordingly, we currently are not planning to decouple pointers from switch to single objects, we generate a lot of metadat which help us organize everygnig, including notification pointers assignments, which are coupled to the switch, at this point it will be up to user to decide differnt action based on notificaion data

do you have any concearns about that? like performance reasons ? will there be expected a lot of tam notifications per second ?

internally in vendor implementation you can still have "differnet" pointers on each created internally tam objects, but assign them the same pointer that was given by switch object

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we have a restriction in the metadata and other aspects, I think it is OK. I will wait for the name edits that I suggested above, before approving this PR. This callback item is closed from my side.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Those are not hard restrictions, but we're made for convince, and drifting away from current schema would cause a lot of refactoring

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thats OK, I am not talking about hard restrictions alone. No need to change all the infra for this for now.

However, I request you to consider the possibility that the callbacks may - at sometime in future - may be per-object and plan any changes for this. Thanks @kcudnik

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there is also another side why all pointers are in switch, during warm boot of the platform, user is calling create_switch, and it can pass new pointers to the switch to receive notifications, this is importnt, since after platform warm boot, new notifications wii most likely have different memory address than previous boot, so notifications would need to be updated. and this could be done in a single call during create switch, if you would have pointers in objects , you would need to go and find that object and assign new notification on it or you would have invalid pointer, or null, depends how platform handles pointers in sdk

@@ -2370,6 +2460,8 @@ typedef struct _sai_tam_api_t
sai_remove_tam_counter_subscription_fn remove_tam_counter_subscription;
sai_set_tam_counter_subscription_attribute_fn set_tam_counter_subscription_attribute;
sai_get_tam_counter_subscription_attribute_fn get_tam_counter_subscription_attribute;
sai_bulk_object_create_fn create_tam_counter_subscriptions;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please use a tam string as part of the type for consistency.

Copy link
Contributor Author

@Pterosaur Pterosaur Dec 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess it's a convention of SAI to use sai_bulk_object_xxxx for bulk operation.

SAI/inc/sailag.h

Lines 395 to 396 in ad20062

sai_bulk_object_create_fn create_lag_members;
sai_bulk_object_remove_fn remove_lag_members;

SAI/inc/sainexthop.h

Lines 335 to 338 in ad20062

sai_bulk_object_create_fn create_next_hops;
sai_bulk_object_remove_fn remove_next_hops;
sai_bulk_object_set_attribute_fn set_next_hops_attribute;
sai_bulk_object_get_attribute_fn get_next_hops_attribute;

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok

@Pterosaur Pterosaur force-pushed the ipfix_stream_telemetry branch 2 times, most recently from 323db70 to 8762286 Compare December 6, 2024 06:45
inc/saitam.h Show resolved Hide resolved
@Pterosaur Pterosaur force-pushed the ipfix_stream_telemetry branch from 8762286 to 4a210e3 Compare December 6, 2024 15:21
@Pterosaur Pterosaur requested a review from kcudnik December 6, 2024 15:24
@Pterosaur Pterosaur force-pushed the ipfix_stream_telemetry branch 2 times, most recently from b0f2f3c to 4a210e3 Compare December 6, 2024 15:39
@kcudnik
Copy link
Collaborator

kcudnik commented Dec 6, 2024

please fix build

@Pterosaur Pterosaur force-pushed the ipfix_stream_telemetry branch 2 times, most recently from e78949d to b704cf9 Compare December 6, 2024 16:00
@Pterosaur
Copy link
Contributor Author

Pterosaur commented Dec 6, 2024

please fix build

Hi @kcudnik , it was failed on the ancestry check. I feel the git graph check has an issue, it shouldn't check the conflict on these commits belong to this PR but check the conflict between PR with master branch.
Otherwise, if I want to fix the ancestry check, I have to rebase the commit history.

./checkancestry.sh
ancestry graph
*   74f9cf8 Merge 4a210e373be4db0c1e7a980150000509ceb4852d into ad20062eb1f2a24aa79983422e83b921b49c011f
|\  
| * 4a210e3 Change the default value for chunk count
| * c449e5d Merge branch 'master' into ipfix_stream_telemetry
|/  
* ad20062 Introduce new extended port oper status notification (#2087)
git rev list from 5f75d99 to HEAD
git checkout work tree commits: 5f75d99 91ef792 97c8ed2 ad20062 c449e5d 4a210e3 74f9cf8
loaded history from ancestry.5f75d99.history

@kcudnik
Copy link
Collaborator

kcudnik commented Dec 6, 2024

There is no easy way to do this you need to squash your commits and force push

Signed-off-by: Ze Gan <[email protected]>
@Pterosaur Pterosaur force-pushed the ipfix_stream_telemetry branch from b704cf9 to 9de0f17 Compare December 7, 2024 04:40
* @type sai_tam_tel_type_mode_t
* @flags CREATE_ONLY
* @default SAI_TAM_TEL_TYPE_MODE_SINGLE_TYPE
*/

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For clarity, it should be marked as:
@validonly SAI_TAM_TEL_TYPE_ATTR_TAM_TELEMETRY_TYPE == SAI_TAM_TELEMETRY_TYPE_COUNTER_SUBSCRIPTION

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good idea,thanks for your suggestion.

@Pterosaur
Copy link
Contributor Author

@kcudnik Could you please help to merge this PR?

@kcudnik kcudnik merged commit 2602b2a into opencomputeproject:master Dec 10, 2024
3 checks passed
@Pterosaur Pterosaur deleted the ipfix_stream_telemetry branch December 10, 2024 10:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
reviewed PR is discussed in SAI Meeting
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants