Issues that arise when clients are running are the most difficult to pinpoint.
Some of these issues can be:
There are tools that allow you to investigate all of the above without having to instrument the client. You can also check slog2info for possible errors.
You can figure out the requests that are sent from the clients to the server by dumping the contents of the /dev/screen/requests file. This file may not be there if it's been disabled in graphics.conf, or if it has been explicitly disabled in the configuration file. There might be a lot of requests in that file. The requests logsize is usually set to 64 KB. You can change this number to be greater if you can't get to the file before requests of interest are flushed out. See Configure globals subsection of Configuring Screen for more details.
The /dev/screen/requests file is in binary format. It actually points to a ring buffer where messages are saved as they are received. To make sense of this binary format, you can use the screeninfo tool. See screeninfo under Screen Utilities for more details.
Here's the command you'd use:
# screeninfo /dev/screen/requests
and here's what you could see:
context=15;subtype=WIN_GET_EVENT;timeout=-1 [EOK] context=14;subtype=WIN_GET_EVENT;timeout=-1 [EOK] context=5;subtype=WIN_GET_EVENT;timeout=-1 [EOK] context=13;subtype=WIN_TRANSACT;parts={ subtype=WIN_SET_PROPERTY;win-14;SCREEN_PROPERTY_VISIBLE=1 [EOK] subtype=WIN_SET_PROPERTY;win-14;SCREEN_PROPERTY_ZORDER=1000 [EOK] subtype=WIN_SET_PROPERTY;win-17;SCREEN_PROPERTY_VISIBLE=0 [EOK] subtype=WIN_SET_PROPERTY;win-17;SCREEN_PROPERTY_ZORDER=0 [EOK] subtype=WIN_SET_PROPERTY;win-20;SCREEN_PROPERTY_VISIBLE=0 [EOK] subtype=WIN_SET_PROPERTY;win-20;SCREEN_PROPERTY_ZORDER=0 [EOK] subtype=WIN_FLUSH;blits=0;wait_idle=0 [EOK] } [EOK] context=13;subtype=WIN_GET_EVENT;timeout=-1 [EOK] context=13;subtype=WIN_TRANSACT;parts={ subtype=WIN_GET_PROPERTY;dpy-1;pname=SCREEN_PROPERTY_FOCUS [EOK] } [EOK] context=6;subtype=WIN_GET_EVENT;timeout=-1 [EOK] context=10;subtype=WIN_TRANSACT;parts={ subtype=WIN_GET_PROPERTY;ssn-51;pname=SCREEN_PROPERTY_MODE [EOK] } [EOK]
All the requests that come from all the different contexts currently connected to Screen are shown in /dev/screen/requests. The context that makes a request is identified before each request (e.g., context=15;subtype=WIN_GET_EVENT;). You'll see that some requests go out on their own while others are items in an atomic transaction. This is because some Screen API functions are immediate, delayed, or flushing. See "Function execution types". You'll also see that for each request, you can get an error status (e.g., [EOK]). If a client process is complaining that it's getting errors from its Screen API calls, you can certainly refer to this error status to figure out the offending function call.
If you spot an error in one of the requests, you'll know which request caused the error. However, the contents of /dev/screen/requests will not necessarily tell you why there was an error. In many cases you can figure it out just by looking at the request. In other cases, you'll have to dig some more.
If you're looking for a request that's not there, most likely the application hasn't made the call, or it's deferred and waiting to be flushed. You'd be surprised how often this is the case. When debugging an error in your application, it's a good idea to call screen_flush_context() after any API function which is of the type: delayed execution. Calling screen_flush_context() helps you to determine the exact function call that caused the error.
Client errors are not logged in the system log; they're returned to the clients. You can check the last error that was reported to a context by looking at the corresponding context file under /dev/screen. This is also a binary file, so you'll need to use the screeninfo utility to make sense of it:
# screeninfo /dev/screen/pid/ctx-context-id
where pid is the application's process ID and context-id is the ID of the context where the error occurred.
Here's what you could see:
ctx-7/6799484(winmgr) ================================================= id = screen-ctx-7-00000025-b18c4795d6b74be7bd5cec2fa6b999e6 id string = permissions = rwxrwxrwx--x--x--x--x--- display = dpy-1 status = CREATED process group id = 6799484 user id = 0 group id = 0 inject allowed = 1 flags = SCREEN_WINDOW_MANAGER_CONTEXT SCREEN_INPUT_PROVIDER_CONTEXT SCREEN_DISPLAY_MANAGER_CONTEXT queue size = 100 last activity = 1429793855189 touch focus = error log = In transaction item 1 out of 1: screen_get_window_property(win-16, pname=44) [48] The request is currently not supported, or not supported for that object type, or for that object in its current state.
From here, we can see that the request screen_get_window_property_iv(win-16, SCREEN_PROPERTY_STRIDE, ...) caused a ENOTSUP error. This happened because the application was trying get a property that didn't belong to a window; windows don't have strides, buffers do.
From a context file, you can also see the list of events that have already been sent to the context. For example:
# screeninfo /dev/screen/1495071/ctx-2 ctx-2/1495071(sw-vsync) ================================================= id = screen-ctx-2-00000010-e43907dd7f89e872b5c43c8849526d70 id string = permissions = rwx--x--x--x--x--x--x--- acl = (empty) display = dpy-2 status = CREATED process group id = 1495071 user id = 0 group id = 0 inject allowed = 1 flags = SCREEN_APPLICATION_CONTEXT queue size = 20 last activity = 1450383443524 touch focus = SCREEN_EVENT_PROPERTY;timestamp=1450383175349;win-0;pname=SCREEN_PROPERTY_FOCUS SCREEN_EVENT_PROPERTY;timestamp=1450383175349;win-0;pname=SCREEN_PROPERTY_STATUS SCREEN_EVENT_DEVICE;timestamp=1450383585914;dev-0;attached SCREEN_EVENT_DEVICE;timestamp=1450383637260;dev-0;detached SCREEN_EVENT_DEVICE;timestamp=1450383638457;dev-0;attached SCREEN_EVENT_DEVICE;timestamp=1450383638700;dev-0;detached
If it looks like it's the contents of a window that are wrong, there are ways to inspect the content of each buffer of every window. This service is provided as a BMP file under the folder that exists for each pixmap or stream that was created by each context.
Find the window that contains the error, and then find the stream that the window is using. Once you find the stream, have a look at its front buffer. First you have to figure out which is the front buffer for that window. To do this, you need to see the contents of the text file associated with the window.
The win-window_id file in /dev/screen/pid/win-window_id shows the values of the window parameters that Screen's currently using. When you have an application running, Screen creates a directory for each application window, and each of these directories contains files associated with the window.
For example, for process 1495071 with only one application window, you can use this command from your shell:
# cat /dev/screen/1495071/win-0
Here's part of the file you may see:
win-0/ctx-2/1495071(sw-vsync) ================================================= id = screen-win-0-00000011-dd6ba1aeacf03bc1755f9950f5aaad1c id string = sw-vsync permissions = rwx--x--x--x--x--x--x--- acl = (empty) display = dpy-2 status = FULLY VISIBLE type = SCREEN_APPLICATION_WINDOW autonomous = 0 references = 1 window manager string = pipeline id = 1 parent = (none) children = (none) window above = (none) window below = (none) alternate window = (none) root window = (none) shadow = (none) reclip = 0 updates = 0 locked = 0 valid = 0x00000018 class = flags = WIN_FLAG_VISIBLE WIN_FLAG_FLOATING buffer size = 1280x720 format = SCREEN_FORMAT_RGBX8888 color space = SCREEN_COLOR_SPACE_UNCORRECTED usage = SCREEN_USAGE_NATIVE order = 0 swap interval = 1 default = ssn-0 sessions = *ssn-0 streams = str-0 holes = (none) ...
From the above window example, you can see that the window has a stream, str-0. You can look under the str-0 directory to see associated information on the stream. For example:
# ls /dev/screen/1495071/str-0 str-0 str-0-1.bmp # cat /dev/screen/1495071/str-0/str-0 str-0/ctx-2/1495071(sw-vsync) ================================================= id = screen-str-0-00000013-4ce1a81eab75c607b80e317b90e26509 id string = permissions = rwx--x--x--x--x--x--x--- acl = (empty) display = dpy-2 status = INVISIBLE autonomous = 0 mode = (SCREEN_STREAM_MODE_FIFO) buffer count = 1 flags = (none) buffer size = 1280x720 format = SCREEN_FORMAT_RGBX8888 color space = SCREEN_COLOR_SPACE_UNCORRECTED usage = SCREEN_USAGE_NATIVE swap interval = 1 protection requested = (none) protection enabled = (none) owner = win-0/ctx-2/1495071(sw-vsync) refs = (self) front = 0 index=0 flags=0x0001 dirty=(880,0;881,720) back = [ 0 ] rcvids = (none) buffers = str-0-1 gd=0 nbytes=4194304 usage=0x0008 flags=0x8240 paddr=0x00000000 vaddr=0x40400000 cvaddr=0x00000000 dvaddr=0x00000001 wfd=0x00000000 cm=0x08893350 egl=0x00000000 native=0x080a95b0 metrics.cpu = 8589934590 metrics.gpu = 0 metrics.objcnt = 0 metrics.apicnt = 0 metrics.drawcnt = 0 metrics.tricnt = 0 metrics.vtxcnt = 0 metrics.teximg = 0 metrics.subdata = 0 metrics.blits.count = 23588 metrics.blits.pixels = 19 Mpixels metrics.blits.reads = 0 bytes metrics.blits.writes = 76 Mbytes metrics.posts.count = 11794 metrics.posts.pixels = 0 pixels
The above example tells us the stream's front buffer is index 0. The image to look for is one called str-0-1.bmp. It's important to keep in mind that the system is still running while you do all these operations. You can check the front buffer and then moments later it will be different. That's why it's a good idea to copy all of them. It's also possible for an application to render into the buffer while you're reading or copying it, which can cause tearing artifacts. Therefore, looking at the bitmaps isn't the best way to investigate tearing on the display.
If there are no issues with the buffer contents, or the problem isn't the contents but the position of the window on the screen, or it's the visibility, you can check the display file that we looked at before in the "At initialization" section. Pay special attention to the scene description to try to figure out what might be wrong.
dpy-2/ctx-0/0(screen) ================================================= ... modifiers = (none) scene = win-2 ctx-3 ssn-1 pid=1622052(sw-vsync) order=0 FULLY VISIBLE sw-vsync win-0 ctx-2 ssn-0 pid=1609762(gles2-gears) order=0 FULLY VISIBLE gles2-gears strategy = WIN_RENDER_CM bypass = 0 dirty = (none) windows = win-0 win-2 sessions = *ssn-1 win-2 ctx-3 pid=1622052(sw-vsync) type=NONE order=0 mode=default SCREEN_SENSITIVITY_TEST *ssn-0 win-0 ctx-2 pid=1609762(gles2-gears) type=NONE order=0 mode=default SCREEN_SENSITIVITY_TEST background = (none) holes = (none) ...
The scene tells you all the windows that are marked as visible on the display. In the above example, there are 2 visible windows on the display (two applications displaying one window each). Windows that aren't realized or haven't posted a front buffer yet aren't in the scene. The list is sorted in descending order. The windows with higher z-order are listed first. The windows at the bottom of the list are lower in z-order. If your window isn't there, check that it has buffers, that it's not marked as invisible, or that it isn't parented by an invisible window. Note that covered windows are part of the scene. The windows list enumerates all the windows that are composited to a framebuffer (windows = win-0 win-2). If a window isn't there, it's because it's going straight to a pipeline, or it's invisible because it's covered by another window, or the window is clipped, or the window's transparency is SCREEN_TRANSPARENCY_DISCARD. Don't forget that windows are drawn from bottom to top, so this is the order in which the windows are presented in the windows list. You can check the pipelines to see which windows might be showing on a pipeline.
Now, let's say we position the window of one of the applications (sw-vsync) to be outside the bounds of the display. Here's an example what the scene could look like:
dpy-2/ctx-0/0(screen) ================================================= ... modifiers = (none) scene = win-2 ctx-3 ssn-1 pid=1622052(sw-vsync) order=0 CLIPPED sw-vsync win-0 ctx-2 ssn-0 pid=1609762(gles2-gears) order=0 FULLY VISIBLE gles2-gears strategy = WIN_RENDER_CM bypass = 0 dirty = (none) windows = win-0 sessions = *ssn-1 win-2 ctx-3 pid=1622052(sw-vsync) type=NONE order=0 mode=default SCREEN_SENSITIVITY_TEST *ssn-0 win-0 ctx-2 pid=1609762(gles2-gears) type=NONE order=0 mode=default SCREEN_SENSITIVITY_TEST background = (none) holes = (none) ...
In this second example of a scene, you can see that there are still 2 windows in the scene. However, one window is clipped (i.e., not visible). The clipped window is still a valid render target, but it isn't visible on the display. Therefore, the clipped window isn't composited to the framebuffer, and that's why it isn't in the windows list (windows = win-0).
When we look at a scene, most of this state is controlled by the properties of the various windows. If something's not right, the next thing to do is go look at the relevant windows and see what their properties are. The files associated with the windows provide values for the different properties and they also provides the internal state of the windows.
If everything from the windows perspective looks good, the problem may first appear in the framebuffer (i.e., the problem may be compositing into the framebuffer). You can look at the buffers of the framebuffer. Because the framebuffer is a window just like any other window, you can use all the tools and files we've discussed in the earlier sections to check the framebuffer. If the contents of the framebuffer don't look quite right, you can dig into all the operations that went into generating the contents of the framebuffer. The /dev/screen/0/blt-[id] file contains all the operations performed on the framebuffer for a display. This is a binary ring buffer, so you must use screeninfo to decipher it:
# screeninfo /dev/screen/0/blt-1
and here's what you could see:
4252> blit[source,nonpre,33](win-5, (0, 0, 50, 10), (0, 50, 50, 10)) [ok] 4252> blit[source,nonpre,33](win-6, (0, 0, 50, 10), (718, 50, 50, 10)) [ok] 4253> fill[255](0, 0, 768, 60, #ff000000) [ok] 4253> blit[source,premult,255](win-2, (0, 0, 768, 60), (0, 0, 768, 60)) [ok] 4253> blit[source,nonpre,33](win-5, (0, 0, 50, 10), (0, 50, 50, 10)) [ok] 4253> blit[source,nonpre,33](win-6, (0, 0, 50, 10), (718, 50, 50, 10)) [ok] 4254> fill[255](0, 0, 768, 60, #ff000000) [ok] 4254> blit[source,premult,255](win-2, (0, 0, 768, 60), (0, 0, 768, 60)) [ok] 4254> blit[source,nonpre,33](win-5, (0, 0, 50, 10), (0, 50, 50, 10)) [ok] 4254> blit[source,nonpre,33](win-6, (0, 0, 50, 10), (718, 50, 50, 10)) [ok] 4255> fill[255](0, 0, 768, 60, #ff000000) [ok] 4255> blit[source,premult,255](win-2, (0, 0, 768, 60), (0, 0, 768, 60)) [ok] 4255> blit[source,nonpre,33](win-5, (0, 0, 50, 10), (0, 50, 50, 10)) [ok] 4255> blit[source,nonpre,33](win-6, (0, 0, 50, 10), (718, 50, 50, 10)) [ok] 4256> fill[255](0, 0, 768, 60, #ff000000) [ok] 4256> blit[source,premult,255](win-2, (0, 0, 768, 60), (0, 0, 768, 60)) [ok] 4256> blit[source,nonpre,33](win-5, (0, 0, 50, 10), (0, 50, 50, 10)) [ok] 4256> blit[source,nonpre,33](win-6, (0, 0, 50, 10), (718, 50, 50, 10)) [ok] 4257> fill[255](0, 0, 768, 60, #ff000000) [ok] 4257> blit[source,premult,255](win-2, (0, 0, 768, 60), (0, 0, 768, 60)) [ok] 4257> blit[source,nonpre,33](win-5, (0, 0, 50, 10), (0, 50, 50, 10)) [ok] 4257> blit[source,nonpre,33](win-6, (0, 0, 50, 10), (718, 50, 50, 10)) [ok] 4258> fill[255](0, 0, 768, 60, #ff000000) [ok] 4258> blit[source,premult,255](win-2, (0, 0, 768, 60), (0, 0, 768, 60)) [ok] 4258> blit[source,nonpre,33](win-5, (0, 0, 50, 10), (0, 50, 50, 10)) [ok]
Each line starts with a frame number, so you can see which series of operations belong to the same update. Just like for the requests, there is an error status. In some rare cases, you'll see an error, which explains why the display isn't showing the right thing. In most cases, errors in this file will also have corresponding messages in the system log, which may provide additional information.
Similar to the requests file, this file can be disabled, so you must ensure that graphics.conf doesn't set its size to zero if you need it. Normally, a size of 4 KB to 8 KB is sufficient to capture several hundreds of operations. If you need to look further back in time, you can make it bigger. See Configure globals subsection of Configuring Screen for more details.