Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ASoC: SOF: hda: fix infinite XRUN after reusuming from S3 #1072

Merged
merged 2 commits into from
Jul 9, 2019

Conversation

RanderWang
Copy link

BugLink: thesofproject/sof#1594

In ASoC, hda codecs and SOF driver set the same stream tag
to hardware for a stream and this stream tag should be restored
to hardware after resuming from S3. For this bug, there are
two capture streams active before S3 and one is released when
doing S3. After resuming from S3, hda codec driver restores
stream tag for the active stream, but SOF restores another
stream tag which is allocated again, and this makes controller
don't get any capture data, then infinite XRUN happens in FW.

Now this patch checks whether there is a stored dma data for
stream resuming from S3 and reuses it. And it also removes
dma data when the stream is released.

Tested on whiskylake with SOF driver.

It fixes thesofproject/sof#1594

@keyonjie
Copy link

keyonjie commented Jul 3, 2019

Hi @RanderWang are you meaning that for the active stream, we didn't call hda_link_hw_free() before entering S3?

@RanderWang
Copy link
Author

RanderWang commented Jul 3, 2019

Hi @RanderWang are you meaning that for the active stream, we didn't call hda_link_hw_free() before entering S3?

This should not be done by driver. It is maintained by ASoC.
For active stream, ASoC doesn't invoke any free function when doing S3.

@keyonjie
Copy link

keyonjie commented Jul 3, 2019

Hi @RanderWang are you meaning that for the active stream, we didn't call hda_link_hw_free() before entering S3?

This should not be done by driver. It is maintained by ASoC.
For active stream, ASoC doesn't invoke any free function when doing S3.

If hw_free() is not called before suspend, we shouldn't call hw_params() at resume, isn't it?

@kv2019i
Copy link
Collaborator

kv2019i commented Jul 3, 2019

@keyonjie wrote:

If hw_free() is not called before suspend, we shouldn't call hw_params() at resume, isn't it?

I'm afraid we have to as the DSP has lost its state due to power cycle, so we need to program the link DMA again. ASoC won't call hw_params again as hw_params have not changed, but it's upto us to reprorammg the hardware in prepare().

@RanderWang what I don't understand from the patch is that why hda_link_stream_assign() gets confused -- the functions seems to be written in a way that it can be called multiple times (like hw_params() can) and should return the correct result. I.e. hda_check_fes() should always find a matching result. When you say other stream is removed, hda_check_fes() should still work for the remainin stream, right?

All in all this is a very good find, thanks Rander!

@kv2019i
Copy link
Collaborator

kv2019i commented Jul 3, 2019

@RanderWang I have some comments on the commit message:

ASoC: SOF: hda: fix infinite XRUN after reusuming from S3

The XRUN loop is a second order outcome. How about:

ASoC: SOF: hda: fix link DMA config

BugLink: thesofproject/sof#1594

Don't but the buglink at start of git commit. I don't think we have established practise for this.
Pierre has done this a couple of times and the patches have been accepted, so should be ok:

GitHub issue: thesofproject/sof#1594

And put it at end end, in the same section as the signed-off-by line (see Pierre's patches).

two capture streams active before S3 and one is released when
doing S3. After resuming from S3, hda codec driver restores

This sounds a little odd. Is the other stream terminated by the user while S3 resume is still not complete? Does the audio file end during S3 resume?

Tested on whiskylake with SOF driver.

Please use the correct spelling: "Tested on a Whiskey Lake platform".
https://ark.intel.com/content/www/us/en/ark/products/codename/135883/whiskey-lake.html

As this is a SOF driver patch, no need to say you tested with SOF driver.

Copy link
Collaborator

@kv2019i kv2019i left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Small issues and clarifirication needs, but otherwise looks good -- a very good find!

@ranj063
Copy link
Collaborator

ranj063 commented Jul 3, 2019

Good finding @RanderWang . Please fix typo in the commit message apart from the things @kv2019i mentioned too reusuming -> resuming

@RanderWang
Copy link
Author

@RanderWang what I don't understand from the patch is that why hda_link_stream_assign() gets confused -- the functions seems to be written in a way that it can be called multiple times (like hw_params() can) and should return the correct result. I.e. hda_check_fes() should always find a matching result. When you say other stream is removed, hda_check_fes() should still work for the remainin stream, right?

Thanks! I will refine my explanation. Actually, there are two pcm capture streams, which is composed of a FE stream and a BE stream. One of the pcm capture stream is released when doing S3. hda_check_fes is still work for another pcm capture stream.

@RanderWang
Copy link
Author

two capture streams active before S3 and one is released when
doing S3. After resuming from S3, hda codec driver restores

This sounds a little odd. Is the other stream terminated by the user while S3 resume is still not complete? Does the audio file end during S3 resume?

I will refine my comments. The other unused pcm capture stream is terminated by pulse audio and it occurs before S3. The termination is not caused by S3.

@RanderWang
Copy link
Author

RanderWang commented Jul 4, 2019

Updated my PR and fix another stream id issue(please check aab9bf9)

@ranj063 Please check the new change which modifies your algorithm for S3. Thanks!
I also test the case for you patch : play a music --- pause playback -- do S3 suspend ---do S3 resume --- continue playback .

@kv2019i
Copy link
Collaborator

kv2019i commented Jul 4, 2019

@RanderWang The patch looks good now, but I'm still having difficulty with the commit message.

  • You still misspell Whiskey Lake (Whiskey not Whisky) :D See the link above for the official page.

  • This sentence doesn't calculate:

After system resume, hda codec driver restores the stream tag for
the active pcm stream, but SOF restores another stream tag which is
allocated with a different stream tag, and this makes controller

A stream tag allocated with a different stream tag, this doesn't make sense.
How about this wording:

After system resume, hda codec driver restores the stream tag for
the active pcm stream, but SOF goes to assign a new one, which now
doesn't match with the stream tag used by codec driver, and this makes
controller...

... I think this is the main issue. What I still don't fully understand is
that why hda_link_stream_assign() doesn't find the right stream_tag
with current code. We are not freeing the 'link_dev' instance, so
the data with the right tag should still be there. Oh well, you don't
need to explain this in the commit -- this is more to understand
whether the fix is right or not.

@@ -330,7 +330,8 @@ static int hda_link_pcm_trigger(struct snd_pcm_substream *substream,
if (ret < 0)
return ret;
stream_tag = hdac_stream(link_dev)->stream_tag;
snd_hdac_ext_link_clear_stream_id(link, stream_tag);
if (substream->stream == SNDRV_PCM_STREAM_PLAYBACK)
snd_hdac_ext_link_clear_stream_id(link, stream_tag);
Copy link
Collaborator

@kv2019i kv2019i Jul 4, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@RanderWang But why? We need link and host DMA for capture as well, why the different treatment?
UPDATE: answering to myself: I was confused by HDAC interface. snd_hdac_ext_link_clear_stream_id() specifically modifies status for output stream as per HDA spec. Nothing in hdac_ext code hints at this, which is confusing. But oh well, this is outside your patch, sorry for the noise. This second patch looks good.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

... I think this is the main issue. What I still don't fully understand is
that why hda_link_stream_assign() doesn't find the right stream_tag
with current code.

@RanderWang I've been trying to reproduce this by on WHL but to no avail.
While I still don't fully get how stream_assign() is supposed to work,
it would seem your patch is at least the safer option.

It's confusing that stream_assign() basicly returns the first available hdac-ext stream, and if there
are any non-opened streams in the bus (with matching direction), the caller will get a random stream back. The assumption made is that this only happens in non-host-dma cases, but it's very hard to verify from code review this assumption is ok.

Copy link
Author

@RanderWang RanderWang Jul 5, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kv2019i very good finding !
the author of this function is @ranj063 , but I can share my understanding.
In ASoC, the calling sequence of playback is : FE stream open --- BE stream hw_param --- FE stream hw_param. The hdac_ext stream for FE stream is allocated in hda_dsp_pcm_open firstly, and for BE stream is done in hda_link_hw_params later. So in stream_assign, the assumption is if there are FE & BE for a pcm stream, they will be combined with the same tag when searching list from head to tail.

This is not reasonable, and the allocation is not optimized, .e.g
(1) pcm1 FE open + BE hw_param// both stream tag 2 is allocated for the FE & BE
(2) pcm2 FE open// stream tag 3 is allocated for the FE
(3) pcm1 FE & BE release //both stream tag 2 is released for the FE & BE
(4) pcm2 BE hw_param // stream tag 2 is allocated for this BE for the FE with stream tag 2 is not used now. But the optimized way is allocating stream tag 3 to the BE.

We should check all the streams to find whether there is a stream used by the FE of this BE, if there is none, the BE should be a hostless stream.

My patch is not for this case. I think we should not release stream for BE when doing S3. the hda stream for FE in SOF driver is still active and the stream tag is still used by hda codec driver, so we should let ASoC to release all the streams in normal way

Copy link
Author

@RanderWang RanderWang Jul 5, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know whether you know the background of hda dma. There are two hda dma engines : host dma and link dma. For SOF, dsp is enabled and hda dma is set to decoupled mode. host dma channel number is not necessary equal to link dma for a pcm stream. Also there is a case of N host dma linked to M link dma . But for a pcm with only one FE and BE, it is best to use the same channel number(stream tag) for a hardware requirement( it is a long story...).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @RanderWang for the explanation! The patch and commit message now look ok.
What confused me was the documentation (in hdac library) for snd_hdac_ext_link_clear_stream_id(). It doesn't describe that in HDA spec, this property is only valid for outputs. But yeah. outside this patch.

@RanderWang
Copy link
Author

@RanderWang The patch looks good now, but I'm still having difficulty with the commit message.

  • You still misspell Whiskey Lake (Whiskey not Whisky) :D See the link above for the official page.
  • This sentence doesn't calculate:

After system resume, hda codec driver restores the stream tag for
the active pcm stream, but SOF restores another stream tag which is
allocated with a different stream tag, and this makes controller

A stream tag allocated with a different stream tag, this doesn't make sense.
How about this wording:

After system resume, hda codec driver restores the stream tag for
the active pcm stream, but SOF goes to assign a new one, which now
doesn't match with the stream tag used by codec driver, and this makes
controller...

... I think this is the main issue. What I still don't fully understand is
that why hda_link_stream_assign() doesn't find the right stream_tag
with current code. We are not freeing the 'link_dev' instance, so
the data with the right tag should still be there. Oh well, you don't
need to explain this in the commit -- this is more to understand
whether the fix is right or not.

@kv2019i Thanks for your help on commit message. I will study and refine it.

if (hdac_stream(stream)->direction ==
SNDRV_PCM_STREAM_CAPTURE)
continue;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not use the same format:

	if (substream->stream == SNDRV_PCM_STREAM_PLAYBACK)
		snd_hdac_ext_link_clear_stream_id(link, stream_tag);

Copy link
Author

@RanderWang RanderWang Jul 5, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because of limitation of 80 char. With the line of if(xxxx),
the size of line "snd_hdac_ext_link_clear_stream_id(link, stream_tag);"
is also more than 80 chars. to make it like:
snd_hdac_ext_link_clear_stream_id(link,
stream_tag);
It is still also more than 80 chars. I did this is to make it beautiful

@@ -372,7 +373,8 @@ static int hda_link_hw_free(struct snd_pcm_substream *substream,
return -EINVAL;

stream_tag = hdac_stream(link_dev)->stream_tag;
snd_hdac_ext_link_clear_stream_id(link, stream_tag);
if (substream->stream == SNDRV_PCM_STREAM_PLAYBACK)
snd_hdac_ext_link_clear_stream_id(link, stream_tag);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe it is better

if (substream->stream == SNDRV_PCM_STREAM_PLAYBACK) {
    stream_tag = hdac_stream(link_dev)->stream_tag;
    snd_hdac_ext_link_clear_stream_id(link, stream_tag);
}

like other codes in the driver.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, I will refine it.

@RanderWang
Copy link
Author

update my PR to refine code style suggested by Libin

ranj063
ranj063 previously approved these changes Jul 5, 2019
Copy link
Collaborator

@ranj063 ranj063 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good, thanks!

@RanderWang
Copy link
Author

Rebased PR, thanks!

@RanderWang
Copy link
Author

The failed CI test is not caused by this PR.

Copy link
Collaborator

@kv2019i kv2019i left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good now. Sorry to nitpick on the commit message language, but with a complicated issue like this, the commit message is very important. So patch ok (including second patch&commit-msg), but a few improvements to the first patch commit msg:

In ASoC, hda codec and SOF driver set the same stream tag to
hardware for a pcm stream and this stream tag should be restored
to hardware after resuming from system suspend. For this bug,
there are two capture pcm streams active before doing system
suspend and one is terminated by user for the stream is unused
and its stream tag is released. Later after doing system suspend

For this bug, there are two capture pcm streams active, with one
stream and its related stream tag released before suspend.

the stream tag for remained active stream is released by SOF driver.

Later when system suspend is done, the stream tag for the remaining
active stream is released by SOF driver.

doesn't match with the stream tag used by codec driver, and this makes
controller don't get any capture data, then infinite XRUN happens in FW.

... and this causes DMA to fail receiving data, leading to unrecoverable
XRUN condition in FW.

@@ -330,7 +330,8 @@ static int hda_link_pcm_trigger(struct snd_pcm_substream *substream,
if (ret < 0)
return ret;
stream_tag = hdac_stream(link_dev)->stream_tag;
snd_hdac_ext_link_clear_stream_id(link, stream_tag);
if (substream->stream == SNDRV_PCM_STREAM_PLAYBACK)
snd_hdac_ext_link_clear_stream_id(link, stream_tag);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @RanderWang for the explanation! The patch and commit message now look ok.
What confused me was the documentation (in hdac library) for snd_hdac_ext_link_clear_stream_id(). It doesn't describe that in HDA spec, this property is only valid for outputs. But yeah. outside this patch.

@RanderWang
Copy link
Author

Update my commit message, thanks!

ranj063
ranj063 previously approved these changes Jul 9, 2019
For this bug, there are two capture pcm streams active, with one
stream and its related stream tag released before suspend. Later
when system suspend is done, the stream tag for the remaining
active stream is released by SOF driver. After system resume, hda
codec driver restores the stream tag for the active pcm stream,
but SOF goes to assign a new one, which now doesn't match with the
stream tag used by codec driver, and this causes DMA to fail
receiving data, leading to unrecoverable XRUN condition in FW.

For stream tag is stored in both hda codec and SOF driver, it
shouldn't be released only in SOF driver. This patch just keeps the
stream information in dma data and checks whether there is a stored
DMA data for stream resuming from S3 and restores it. And it also
removes DMA data when the stream is released.

Tested on Whiskey Lake platform.

GitHub issue: thesofproject/sof#1594
Signed-off-by: Rander Wang <rander.wang@linux.intel.com>
snd_hdac_ext_link_clear_stream_id maps stream id to
link output, which is for playback, not capture.

Tested on Whiskey Lake platform.

Signed-off-by: Rander Wang <rander.wang@linux.intel.com>
Copy link

@keyonjie keyonjie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good, thanks.

@ranj063
Copy link
Collaborator

ranj063 commented Jul 9, 2019

@kv2019i are you good with this PR?

Copy link
Collaborator

@kv2019i kv2019i left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks you @RanderWang , looks good now!

@ranj063 ranj063 merged commit 01373cf into thesofproject:topic/sof-dev Jul 9, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
5 participants