Skip to content

Conversation

@juimonen
Copy link

Codec adapter has a devm_kzalloc type memory allocator to allocate
multiple chunks of memory into a list, which can be then freed without
explicit pointers in codec deletion. Move this generic mechanism to
pipeline class. This is possible as recent changes in driver create the
pipeline before its components.

Signed-off-by: Jaska Uimonen jaska.uimonen@intel.com

@juimonen
Copy link
Author

Help in testing accepted :)

Recent changes in driver should create the pipeline before components -> we could store all memory pointers in 1 place i.e. pipeline that holds the components.

@lgirdwood
Copy link
Member

@RanderWang please use this in your work.
@juimonen are all the dependencies upstream ?

@RanderWang
Copy link
Collaborator

@lgirdwood @juimonen I want to make allocation and deallocation more simple. For audio case, we allocate memory at initializing stage and never allocate any memory dynamically when doing playback or capture. So we can allocate big block of memory based on pipeline memory information which contains all memory usage for each modules, then just increase memory used pointer for each allocation which will save memory without any container. And we just return the whole memory block to OS when pipeline is released.

Copy link
Member

@lgirdwood lgirdwood left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is good.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we pass in caps here too.

@lgirdwood
Copy link
Member

@lgirdwood @juimonen I want to make allocation and deallocation more simple. For audio case, we allocate memory at initializing stage and never allocate any memory dynamically when doing playback or capture.

In most cases this is fine, but there are some cases when we need to load new coefficients or different config data that means a runtime reallocation.

So we can allocate big block of memory based on pipeline memory information which contains all memory usage for each modules, then just increase memory used pointer for each allocation which will save memory without any container. And we just return the whole memory block to OS when pipeline is released.

Let's be more clever here. Passing in the memory size from topology puts a lot of effort on the topology developer. The compiler can do this for us in most cases. We need to add a heap_min and a heap_max in the static component data where we can do things like e.g.

struct comp x {
    .heap_min = align_up(sizeof(x_priv_data) + sizeof(data1) + CONFIG_X_ITEMS * sizeof(data2), cache_line_size),
    .heap_max = "heap_min" + CONFIG_X_MAX_ITEM2 * sizeof(data2),
};

So heap_min will be allocated at new and heap_max can be pre-allocated (i.e. it's SRAM can be powered OFF until needed). But this would work in teh same "increment the pointer allocation method"

I think @juimonen work is the 1st stage in this work, i.e we can merge it first and then build on top.

@RanderWang
Copy link
Collaborator

@lgirdwood @juimonen I want to make allocation and deallocation more simple. For audio case, we allocate memory at initializing stage and never allocate any memory dynamically when doing playback or capture.

In most cases this is fine, but there are some cases when we need to load new coefficients or different config data that means a runtime reallocation.

So we can allocate big block of memory based on pipeline memory information which contains all memory usage for each modules, then just increase memory used pointer for each allocation which will save memory without any container. And we just return the whole memory block to OS when pipeline is released.

Let's be more clever here. Passing in the memory size from topology puts a lot of effort on the topology developer. The compiler can do this for us in most cases. We need to add a heap_min and a heap_max in the static component data where we can do things like e.g.

Actually we do memory usage calculation in driver and also depend on compiler to calculate bss size and store it in module config by rimage when analyzing fw elf file (although rimage doesn't do it now since no such module support in sof fw).

struct comp x {
    .heap_min = align_up(sizeof(x_priv_data) + sizeof(data1) + CONFIG_X_ITEMS * sizeof(data2), cache_line_size),
    .heap_max = "heap_min" + CONFIG_X_MAX_ITEM2 * sizeof(data2),
};

So heap_min will be allocated at new and heap_max can be pre-allocated (i.e. it's SRAM can be powered OFF until needed). But this would work in teh same "increment the pointer allocation method"

I think @juimonen work is the 1st stage in this work, i.e we can merge it first and then build on top.

@lgirdwood
Copy link
Member

Actually we do memory usage calculation in driver and also depend on compiler to calculate bss size and store it in module config by rimage when analyzing fw elf file (although rimage doesn't do it now since no such module support in sof fw).

We need to do this for a short time, but it's not long term viable since it tightly couples the kernel to the FW and can mean kernel fixes to fix FW issues.

Can you paste the calculation code here ?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we just confirm that this is intended and correct - before the change this was freeing one specific object, now after this change this is freeing all allocated blocks?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All pipeline objects will be in a block, but this does need checking here.

Copy link
Collaborator

@lyakh lyakh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A general comment: in the kernel devm_ means automatic resource management. pipeline_devm_alloc() added in this PR doesn't seem to be doing that. If adding that automation is planned in the near future, maybe it's ok, but it would be even better to first add that and then implement this to make sure we don't delete the code we added a short time ago. If this isn't planned maybe it would be better not to use devm_ here

@lgirdwood
Copy link
Member

A general comment: in the kernel devm_ means automatic resource management. pipeline_devm_alloc() added in this PR doesn't seem to be doing that. If adding that automation is planned in the near future, maybe it's ok, but it would be even better to first add that and then implement this to make sure we don't delete the code we added a short time ago. If this isn't planned maybe it would be better not to use devm_ here

Intention this should be adding each pipeline allocation to a list and freeing when pipeline is destroyed. @juimonen will this be in subsequent patches ? @ranj063 @RanderWang fyi

@RanderWang
Copy link
Collaborator

Actually we do memory usage calculation in driver and also depend on compiler to calculate bss size and store it in module config by rimage when analyzing fw elf file (although rimage doesn't do it now since no such module support in sof fw).

We need to do this for a short time, but it's not long term viable since it tightly couples the kernel to the FW and can mean kernel fixes to fix FW issues.

Can you paste the calculation code here ?

It is for CAVS FW. bss is stored in module_entry.

        ret = processor->get_mem_size(sdev, swidget, &ibs, &obs, &bss);
        if (ret < 0)
                return ret;

        task_mem = SOF_IPC4_PIPELINE_OBJECT_SIZE;
        task_mem += SOF_IPC4_MODULE_INSTANCE_LIST_ITEM_SIZE + bss;

        if (sdev->fw_modules[module_id].type & SOF_IPC4_MODULE_LL)
        {
                task_mem += SOF_IPC4_FW_ROUNDUP(SOF_IPC4_LL_TASK_OBJECT_SIZE);
                task_mem += SOF_IPC4_FW_MAX_QUEUE_COUNT * SOF_IPC4_MODULE_INSTANCE_LIST_ITEM_SIZE;
                task_mem += SOF_IPC4_LL_TASK_LIST_ITEM_SIZE;
        }
        else
        {
                task_mem += SOF_IPC4_FW_ROUNDUP(SOF_IPC4_DP_TASK_OBJECT_SIZE);
                task_mem += SOF_IPC4_DP_TASK_LIST_SIZE;
        }

        ibs = SOF_IPC4_FW_ROUNDUP(ibs);
        queue_mem = SOF_IPC4_FW_MAX_QUEUE_COUNT * (SOF_IPC4_DATA_QUEUE_OBJECT_SIZE +  4 * ibs);

        total = SOF_IPC4_FW_PAGE(task_mem + queue_mem);
        if (total > SOF_IPC4_FW_MAX_PAGE_COUNT) {
                dev_info(sdev->dev, "task memory usage %d, queue memory usage %d", task_mem, queue_mem);
                total = SOF_IPC4_FW_MAX_PAGE_COUNT;
        }

Copy link
Collaborator

@lyakh lyakh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks better IMHO!

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pipeline_devm_alloc?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rfree(container)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

comment isn't valid any more?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok will remove

@juimonen
Copy link
Author

@lyakh fyi

  • tried to address your latest comments
  • added now devm_free for possible individual mem region free's (to be concise with kernel devm, not sure if it is ever used)
  • free_all is now static and used only in pipeline_free (so all reserved memory will be freed when pipeline is destroyed)

@lgirdwood
Copy link
Member

@RanderWang any other input from you - this will be a fist step from which we can integrate more into IPC4 code.

Codec adapter has a devm_kzalloc type memory allocator to allocate
multiple chunks of memory into a list, which can be then freed without
explicit pointers in codec deletion. Move this generic mechanism to
pipeline class. This is possible as recent changes in driver create the
pipeline before its components.

Signed-off-by: Jaska Uimonen <jaska.uimonen@intel.com>
@juimonen
Copy link
Author

Can I somehow tell CI to ignore this CamelCase error, which I haven't even introduced?

@lgirdwood
Copy link
Member

Can I somehow tell CI to ignore this CamelCase error, which I haven't even introduced?

Dont worry too much about the CamelCase, the doxygen and build reports are the CIs that need to be fixed.

@lgirdwood lgirdwood added this to the v1.9 milestone Jul 9, 2021
@lgirdwood
Copy link
Member

@RanderWang is this ready to merge on ipc4 branch yet ?

@RanderWang
Copy link
Collaborator

@RanderWang is this ready to merge on ipc4 branch yet ?

@lgirdwood this is not complete. We need to replace all the memory allocation related.
(1) host & dai component which are invoked by copier module
(2) pipeline new and buffer new which will affect current sof design, or design new function only for ipc4

I can refine ipc4 layer and new ipc4 module to allocate memory from pipeline.

}
}

void *pipeline_devm_alloc(struct pipeline *pipe, uint32_t size, uint32_t alignment)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we need add caps for args since for ipc4 we need difference caps for different cases

@lgirdwood
Copy link
Member

@ranj063 fyi - needed for module API

return 0;

/* Find which container keeps this memory */
list_for_item_safe(mem_list, _mem_list, &pipe->mem.mem_list) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no need for _safe - the function only deletes a single element once and returns immediately after that.

return ptr;
}

int pipeline_devm_free(struct pipeline *pipe, void *ptr)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's never called, do we need it?

@sys-pt1s
Copy link

Can one of the admins verify this patch?

@lgirdwood
Copy link
Member

Close - can re-open for integration after module API if needed.

@lgirdwood lgirdwood closed this Apr 20, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants