Skip to content

Conversation

@ekalda
Copy link
Contributor

@ekalda ekalda commented Apr 1, 2022

Integrate the cascader into the codegen and optionally enable it
with the enable_cascader flag. Includes placeholder MemoryRegions until
integration with the PoolInfos provided by a user.

Co-authored-by: Matthew Barrett matthew.barrett@arm.com

@ekalda
Copy link
Contributor Author

ekalda commented Apr 1, 2022

@ekalda
Copy link
Contributor Author

ekalda commented Apr 1, 2022

This patch supersedes #10377, so that one can be closed.

Copy link
Contributor

@manupak manupak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @ekalda!

It is nice to see this working and green in CI.

I have left few comments around testing and user facing errors.


def _ethos_u55_cascader() -> Callable:
flash = MemoryRegion(name="FLASH", size=10 ** 7, read_bandwidth=4, write_bandwidth=4)
sram = MemoryRegion(name="SRAM", size=10 ** 6, read_bandwidth=16, write_bandwidth=16)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Maybe for a subsequent PR] Will it be possible plumb these values using this :

IRModule func_module = WithAttrs(IRModule::FromExpr(func),
{{tvm::attr::kExecutor, executor_},
{tvm::attr::kRuntime, runtime_},
{tvm::attr::kWorkspaceMemoryPools, workspace_memory_pools_}});

An example test :

https://github.com/apache/tvm/blob/main/tests/python/relay/aot/test_crt_aot_usmp.py#L341-L383

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, with the caveat that we'll assume that there is one workspace pool in the system - is it ok to assume that for now or should we handle the case where there are several workspace pools where some of them are not accessible for the NPU?


return conv2d

infra.compare_tvm_with_tflite(tf_graph, [ifm_shape], accel_type, enable_cascader=True)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For these tests, I think we need to ensure the memory usage is reduced.

We could use calculate_workspace_size TIR analysis utility for that. E.g. :

assert tvm.tir.analysis.calculate_workspace_bytes(primfunc, alignment) == size

OR we could use the final memory calculation in a non-USMP flow as follows :

def test_workspace_calculation(workspace_byte_alignment, main_workspace_size):
mod, params = tvm.relay.testing.synthetic.get_workload()
target = "c"
runtime = Runtime("crt")
executor = Executor(
"aot",
{
"workspace-byte-alignment": workspace_byte_alignment,
},
)
with tvm.transform.PassContext(
opt_level=3,
config={
"tir.disable_vectorize": True,
},
):
lib = tvm.relay.build(mod, target, executor=executor, runtime=runtime, params=params)
mlf_memory_map = mlf._build_function_memory_map(lib.function_metadata)
assert mlf_memory_map["main"][0]["workspace_size_bytes"] == main_workspace_size

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed the tests to check for reduction in memory instead of bitwise accuracy with TFLite (I think it would be still good to have some FVP based tests when cascader is enabled, but it looks like that would need quite a bit of infra refactor, so I'll do it in a separate patch, if this is ok)

Copy link
Contributor

@lhutton1 lhutton1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me modulo @manupa-arm's comments :)

@ekalda ekalda force-pushed the integrate-cascader2 branch from 1c20fcd to 2042445 Compare April 11, 2022 11:28
return compiler_attrs.accelerator_config


def enable_cascader():
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is better is_cascader_enabled, otherwise, it seems you are enabling it with this function.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point! I changed the function name to is_cascader_enabled, but kept the flag/variable as enable_cascader to align with the philosophy of enable_usmp

Copy link
Contributor

@NicolaLancellotti NicolaLancellotti left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Thank you @ekalda.

Copy link
Contributor

@manupak manupak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Broadly looks great!

just a nit and a question

mod, params, accel_type, pool_size, enable_cascader=True
)

assert workspace_size_cascader_enabled < workspace_size_cascader_disabled
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

out of curiosity, should we not check for exact values ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes good point, I changed it to check for the exact values

)


def _extract_memory_info(memory_pool):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be better for this utility to be part of the cascader and not get exposed to the codegen here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@ekalda ekalda force-pushed the integrate-cascader2 branch 4 times, most recently from e764f8c to 094fbcb Compare April 21, 2022 13:46
ekalda and others added 8 commits April 22, 2022 09:30
Integrate the cascader into the codegen and optionally enable it
with the enable_cascader flag. Includes placeholder MemoryRegions until
integration with the PoolInfos provided by a user.

Co-authored-by: Matthew Barrett <matthew.barrett@arm.com>
Plumb the workspace memory pools into into the cascader and make
the tests to check for the memory reduction.
Change-Id: If2d92846f05a7e8b21be767163841084538805a9
@ekalda ekalda force-pushed the integrate-cascader2 branch from 094fbcb to f212de6 Compare April 22, 2022 09:11
Copy link
Contributor

@manupak manupak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with a suggestion for a follow up.

), "Exactly one workspace pool needs to be provided for the U55 cascader"

sram = extract_memory_info(workspace_memory_pools.pools[0])
tir_mod = LowerToTIR(_ethos_u55_cascader(sram))(mod)
Copy link
Contributor

@manupak manupak Apr 25, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For a followup : Please consider absorbing the call to extract_memory_info inside the cascader. (Sorry for not being clear before). Ideally, we'd want to remove the "MemoryRegion" construct and to get there in the current direction of travel, we should try to confine the usage of it inside the cascader. Therefore the interface of the cascader should be made to accept MemoryPool(s).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that this would be the right way to go about it. Let's do it in a follow up!

@manupak manupak merged commit d2db9cb into apache:main Apr 25, 2022
@manupak
Copy link
Contributor

manupak commented Apr 25, 2022

Thanks @ekalda @mbaret @lhutton1 @NicolaLancellotti !

This is merged now!

shtinsa pushed a commit to Deelvin/tvm that referenced this pull request May 17, 2022
* [microNPU] Integrate the cascader

Integrate the cascader into the codegen and optionally enable it
with the enable_cascader flag. Includes placeholder MemoryRegions until
integration with the PoolInfos provided by a user.

Co-authored-by: Matthew Barrett <matthew.barrett@arm.com>

* Fix linting and a docstring

* Plumbing and testing improvements

Plumb the workspace memory pools into into the cascader and make
the tests to check for the memory reduction.

* enable_cascader() -> is_cascader_enabled()

* Check for the exact value of workspace size

* Remove unused ACCEL_TYPES

* Linting...

Change-Id: If2d92846f05a7e8b21be767163841084538805a9

* Rebasing...

Co-authored-by: Matthew Barrett <matthew.barrett@arm.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants