Intel® oneAPI Base Toolkit
Support for the core tools and libraries within the base toolkit that are used to build and deploy high-performance data-centric applications.

Issues Running aio.h for Windows Demo Programs - Particularly with FILE_FLAG_OVERLAPPED

connorb

I preferred to have too much info than not enough, but hopefully it's not overwhelming.

Also, my original attempt to submit reported that my message was changed because it contained unsupported HTML. Everything appears to be here, but hopefully nothing was modified that I don't see.

Goal

Build and run a program demonstrating async IO capabilities on Windows using Intel oneAPI's aio.h interface. That is, a program that uses the aio_* routines on a file opened with FILE_FLAG_OVERLAPPED.

Approach

Use the demo programs provided in the "Intel's C++ Asynchronous I/O Extensions for Windows" section of the *"Intel® oneAPI DPC++/C++ Compiler Developer Guide and Reference" to achieve goal.

*To remove ambiguity in different versions of the document, I provided the PDF of the developer guide I used in the attached zip (and the content I referenced comes from pages 1863 through 1876).

Problem

Main problem - All demo programs result in zero-data-transfer read and write operations when the FILE_FLAG_OVERLAPPED flag is set. That is, a given demo program can interact with files and perform meaningful write/read operations without the FILE_FLAG_OVERLAPPED flag, but if the same code is recompiled and run with the flag added, those same operations report "0 bytes [read/written]".

Secondary problem(s) - A number of the demo programs do not result in the expected output provided in the "Intel® oneAPI DPC++/C++ Compiler Developer Guide and Reference" (with or without the FILE_FLAG_OVERLAPPED flag).

 

Since there are several demo programs, collectively demonstrating a variety of issues, I've left the details of the issues observed to the in-line comments in the PowerShell script (in the attached zip) which I used to demonstrate my compile-and-run steps.

I am most interested in resolving the main problem, but I've provided info on the secondary problem(s) observed in case that extra information helps illuminate what my issue is.

Side note

I'm also a little confused as to why the developer guide says (page 1864):

NOTE
The POSIX AIO library and the Microsoft SDK provide similar AIO functions. The main difference between the POSIX AIO functions and the Windows operating system-based AIO functions is that while POSIX allows you to execute AIO operations with any file, the Windows operating system executes AIO operations only with files flagged with FILE_FLAG_OVERLAPPED.

and then proceeds to show two demo programs (pages 1865-1868) which it prefaces with:

Example for aio_read and aio_write Functions
The example illustrates the performance gain of the asynchronous I/O usage in comparison with synchronous I/O usage. In the example, 5.6 MB of data is asynchronously written with the main program computation, which is the scalar multiplication of two vectors with some normalization.

while providing source that does not contain the FILE_FLAG_OVERLAPPED flag (while the other demo programs explicitly provide source which indicates FILE_FLAG_OVERLAPPED can optionally be used).

Granted, this is a pretty minor point as I think it's a pretty easy leap to assume you need to add the FILE_FLAG_OVERLAPPED flag to observe this effect, but it seems weird to me that the source provided doesn't do what the preface states it does, and adding the FILE_FLAG_OVERLAPPED flag to these programs (for me) leads to the point addressed in the "main problem" above, so I'm not sure where the misunderstanding lies here.
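
For reference, here is a minimal sketch of how a plain Win32 overlapped write reports its byte count, without going through aio.h at all. With FILE_FLAG_OVERLAPPED, WriteFile may return FALSE with ERROR_IO_PENDING, and the transferred byte count is only available afterwards from GetOverlappedResult. I am not claiming this explains the "0 bytes" results; it is only a way to check that overlapped I/O itself behaves as expected on a given system. The function name overlapped_write_probe and its non-Windows stub are hypothetical, added so the file compiles anywhere.

```c
#ifdef _WIN32
#include <windows.h>

/* Writes len bytes from buf to path using native Win32 overlapped I/O.
   Returns the number of bytes written, or -1 on failure.
   (Hypothetical helper for illustration; not part of aio.h.) */
long overlapped_write_probe(const char *path, const void *buf, unsigned long len)
{
    HANDLE h;
    OVERLAPPED ov;
    DWORD written = 0;
    BOOL ok;

    h = CreateFileA(path, GENERIC_WRITE, 0, NULL, CREATE_ALWAYS,
                    FILE_ATTRIBUTE_NORMAL | FILE_FLAG_OVERLAPPED, NULL);
    if (h == INVALID_HANDLE_VALUE)
        return -1;

    /* Offset and OffsetHigh stay 0, so the write starts at the file start. */
    ZeroMemory(&ov, sizeof ov);
    /* Manual-reset event, initially unsignaled; signaled on completion. */
    ov.hEvent = CreateEventA(NULL, TRUE, FALSE, NULL);
    if (ov.hEvent == NULL) {
        CloseHandle(h);
        return -1;
    }

    /* The byte-count pointer may be NULL when an OVERLAPPED is supplied. */
    ok = WriteFile(h, buf, (DWORD)len, NULL, &ov);
    if (!ok && GetLastError() != ERROR_IO_PENDING) {
        /* A genuine failure, not merely "still in progress". */
        CloseHandle(ov.hEvent);
        CloseHandle(h);
        return -1;
    }

    /* Block until the operation finishes, then fetch the real count. */
    ok = GetOverlappedResult(h, &ov, &written, TRUE);

    CloseHandle(ov.hEvent);
    CloseHandle(h);
    return ok ? (long)written : -1;
}
#else
/* Stub for non-Windows builds: overlapped I/O is a Win32 feature. */
long overlapped_write_probe(const char *path, const void *buf, unsigned long len)
{
    (void)path;
    (void)buf;
    (void)len;
    return -1;
}
#endif
```

Calling overlapped_write_probe("probe.dat", "hello", 5) from a small main() on Windows and comparing the return value against 5 would show whether the byte count comes back correctly outside of the aio_* routines.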

Environment

Here is some information about my system and software used (hopefully I captured everything):

  • OS
    • "Windows 10 Enterprise 22H2"
  • IDE/Command Line

    • I don't have Visual Studio, but I followed the Intel oneAPI installation instructions (I believe this version) for installing alongside MS Build Tools.

    • I have variously used cmd, PowerShell, and terminal sessions through VSCode (properly calling intel_setvars.bat or "Intel oneAPI: Initialize default environment variables", where appropriate) for testing things with no difference observed.
    • I provide outputs of software versions found when compiling/running the programs at the top of the log files in the attached zip, but here's the information provided from the oneAPI installation log:

      • intel.oneapi.win.basekit.product --product-ver 2023.2.0-49385
      • Product will be (un)integrated with the following IDEs:
      • Microsoft C++ Build Tools* 2022
      • End of IDE list.
  • Compiler, flags, etc.
    • In the attached zip, I've provided the source files I created, the exact PDF reference I pulled them from, two log files from testing these programs (one for 'icl', another for 'icx'), a PowerShell script that I set up to demonstrate these programs, and in-line comments in that script detailing the relevant information at each point in the setup/demonstration (flags, outputs/behavior observed, etc.).

Setup

To achieve the goal, my hope was to compile and run demo programs in which I did not contribute anything to the source to eliminate the possibility of errors on my end. In total, the source files I provide constitute 6 demo programs (all from the referenced/attached developer guide), but in each case it seems there was at least one part of the source I needed to contribute to.

Source - Overview and Primary Source Contributions I Made

There are four *.c files, each of which consists (primarily) of the source provided in the developer guide. For each function I pulled from the guide, I put a comment indicating the pages I found it on. Here's an overview of those files, with the main parts "I touched" highlighted:

  • do_compute.c
    • Auxiliary 'do_compute' function to generate data for IO operations
    • No modification on my part, although I did notice that there seems to be a typo as the "xB" variable is never actually used
  • aio_sample.c, aio_sample2.c
    • Each of these files contains a wholly-provided demo program (written in a main()) from the developer guide
    • These demo programs do not explicitly include FILE_FLAG_OVERLAPPED (i.e., I added the option to use this flag), so I suppose I added it in a context that might not make sense
    • Data files generated/used:
      • aio_sample.c => "aio.dat"
      • aio_sample2.c => "aio_sample2.dat"
  • aio_ex_i.c
    • This file consists of the aio_ex_2, ..., aio_ex_5 functions provided in the guide, but they were provided as stand-alone functions so I had to write my own main() driver
      • I believe I set it up in a sensible way, but given the weird behavior I observed, I definitely could be misunderstanding something
    • These demo programs do contain FILE_FLAG_OVERLAPPED in the source provided, but they appear as:
          FILE_ATTRIBUTE_NORMAL/*|FILE_FLAG_OVERLAPPED*/
      To me, this indicates the same source should work equally well with the
          /*|FILE_FLAG_OVERLAPPED*/
      portion commented/uncommented, but still these programs lead to "0 bytes [read/written]" operations when using the FILE_FLAG_OVERLAPPED flag
    • Data files generated/used:
      • aio_ex_2, ..., aio_ex_5 => "dat"

Source - Other Source Contributions I Made

Above, I highlighted what I felt were the most important parts of the modifications of the provided demo source I made, but there are a few others so I note them here for completeness:

  • All
    • To enable switching between the code using and not-using the FILE_FLAG_OVERLAPPED flag, I wrapped the CreateFile calls in an ifdef block to allow for changing the behavior at compile time (the rest of the contents of the CreateFile calls should be untouched though)
    • Along with this, I added printf calls to indicate which option is being used when the program is run
  • aio_sample2.c
    • I added the "#include <stdio.h>" directive as the 'icx' compilation complained about implicit declaration of printf
  • aio_ex_i.c
    • I needed to add an extra argument to the do_compute call in aio_ex_2 as the statement given in the demo code of aio_ex_2 disagrees with the definition given for the do_compute function
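
As a rough illustration of the compile-time switch described under "All" above, here is a minimal sketch. It is not the exact code from the attached zip: the macro name USE_OVERLAPPED and the helper names are hypothetical, and the fallback constant values are only there so the sketch compiles off Windows (they match the Windows SDK definitions).

```c
#include <stdio.h>

#ifdef _WIN32
#include <windows.h>
#else
/* Fallback values so the sketch compiles off Windows; these match the
   Windows SDK definitions of the two constants. */
#define FILE_ATTRIBUTE_NORMAL 0x00000080
#define FILE_FLAG_OVERLAPPED  0x40000000
#endif

/* USE_OVERLAPPED is a hypothetical macro name, defined on the compiler
   command line (for example /DUSE_OVERLAPPED) to select the async build. */
#ifdef USE_OVERLAPPED
#define DEMO_FILE_FLAGS (FILE_ATTRIBUTE_NORMAL | FILE_FLAG_OVERLAPPED)
#define DEMO_MODE_NAME  "async (FILE_FLAG_OVERLAPPED)"
#else
#define DEMO_FILE_FLAGS (FILE_ATTRIBUTE_NORMAL)
#define DEMO_MODE_NAME  "sync (no FILE_FLAG_OVERLAPPED)"
#endif

/* Returns the value intended for CreateFile's dwFlagsAndAttributes
   argument; every other CreateFile argument stays as in the guide. */
unsigned long demo_file_flags(void)
{
    return (unsigned long)DEMO_FILE_FLAGS;
}

/* Prints which variant was compiled in and returns its name. */
const char *demo_report_mode(void)
{
    printf("Mode: %s, flags = 0x%08lX\n", DEMO_MODE_NAME, demo_file_flags());
    return DEMO_MODE_NAME;
}
```

Keeping the switch confined to the flags argument means the two builds differ only in whether FILE_FLAG_OVERLAPPED is passed, which makes the sync and async runs directly comparable.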

Summary

To end, I'll reiterate the part of the "Problem" section that I think is most important.

I'm most interested in figuring out why the use of the FILE_FLAG_OVERLAPPED flag seems to result in "0 bytes [read/written]" results.

I set up a PowerShell script (provided in the zip) that walks through the various compilations and runs I've done to test/inspect this behavior, and details of the various things I've noticed are mentioned in-line in that file (see demo.ps1 in attached zip). Hopefully the setup of various compile-and-run steps of the script isn't confusing, but I'll quickly summarize what it does here in case that helps:

  1. First, each program is compiled and run as the async (FILE_FLAG_OVERLAPPED) version. Before each compile-and-run, the directory is cleaned of all built source files and any data files. These runs do create the expected data files, but the files are always of size zero
  2. Then, each program is re-run as follows:
    1. The directory is cleaned of previously built source and data files
    2. The synchronous version (non-FILE_FLAG_OVERLAPPED) of a target demo program is compiled and run
      1. This results in creating, and successfully writing to, the associated data file (at least, for those demo programs which run without any "secondary issues")
    3. The directory is then cleaned of only previously built source (i.e., the successfully generated data files are kept in the directory)
    4. The async (FILE_FLAG_OVERLAPPED) version is compiled and run
      1. The hope here was to isolate whether it was just the write operations that were failing (since the programs at least create the files)
      2. However, the "0 bytes read" results are still observed, and the previously generated data files are not modified (i.e., they aren't re-written, written as empty, the "Last Write Time" reported by 'ls' isn't updated, etc.)
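
The size check in the steps above can also be done from C rather than from 'ls'. A minimal sketch, using only standard C I/O; the helper names file_size_bytes and report_data_file are hypothetical and not part of the attached sources:

```c
#include <stdio.h>

/* Returns the size of the file at path in bytes, or -1 if the file
   cannot be opened or measured. */
long file_size_bytes(const char *path)
{
    FILE *fp = fopen(path, "rb");
    long size;

    if (fp == NULL)
        return -1;
    if (fseek(fp, 0L, SEEK_END) != 0) {
        fclose(fp);
        return -1;
    }
    size = ftell(fp);
    fclose(fp);
    return size;
}

/* Prints a one-line report for a data file and returns 1 when the file
   exists and is non-empty, 0 otherwise. */
int report_data_file(const char *path)
{
    long size = file_size_bytes(path);

    if (size < 0) {
        printf("%s: missing or unreadable\n", path);
        return 0;
    }
    printf("%s: %ld bytes%s\n", path, size, size == 0 ? " (empty)" : "");
    return size > 0;
}
```

Calling report_data_file on "aio.dat", "aio_sample2.dat", and "dat" after each run would distinguish a file that was created but never written from one that was never created.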

Edit

In one of my revisions in writing this post, it seems I removed the part stating how to run the PowerShell script so I'm adding it back for completeness:

.\demo.ps1 [icl|icx]

That is, it takes a single argument, 'icl' or 'icx', to select which compiler to use.

Alina_S_Intel

Thank you for your request. We will continue working on it on our support portal, 06069235. 

This forum thread will no longer be monitored. 
