How To Optimize Storage Of NSAttributedString In Swift Using Data And Codable?












9















I am trying to optimize storage space when saving the contents of an NSTextView, namely its NSTextStorage property, itself an NSAttributedString.



Saving it as Data, for example using the rtfd(from:documentAttributes:) method, and as part of a Codable structure, results in a very large string, much larger than the content itself, especially when inserting an image into the NSTextView. For example, inserting a 200K image will result in a 5MB JSON file.



Side note: It is even worse when the Data object is encoded directly rather than as a property of the encoded object, as it is encoded in the form of an array of small integers rather than an arbitrary string. I am not sure why, though I was able to prevent this by inserting the Data into a simple wrapper structure.



Strangely, compressing the actual JSON file using ZIP still results in a 4MB file, merely a 20% gain, so it is unclear to me how a 200K image could turn into such a massive, hardly compressible encoded string.



I would like to figure out what is the proper way to efficiently store NSAttributedString using the Codable protocol. Any hint or advice is much appreciated.



I am also wondering whether there is a valid binary encoding option for Codable.

















      swift codable














      edited Feb 13 at 0:06









      Dávid Pásztor











      asked Nov 24 '18 at 19:58









      jmdecombe

























          1 Answer


















          6





          +50









          TL;DR: RTFD encodes images as PNGs, but you can make it encode JPGs instead to save space. A custom format might be better and easier though if you have the time to create one.



          NSAttributedString can encode to HTML, RTF, RTFD, plain text, a variety of Office/Word formats, etc. Given that each of these is an official format with a spec that must be followed, there's not much that can be done to save space other than:




          1. Choosing the supported format that works best for your use cases and has the smallest footprint.


          OR




          2. Writing your own format.


          Approach 1: RTFD



          Of the supported formats, RTFD does indeed sound best for your use case because it supports attachments such as images. Feel free to try out the other included formats, which are described below in "Other Formats".




          Saving it as Data, for example using the rtfd(from:documentAttributes:) method, and as part of a Codable structure, results in a very large string, much larger than the content itself especially when inserting an image into the NSTextView. For example, inserting a 200K image will result in a 5MB JSON file.




          To understand what is happening here, try out the following code:



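          import AppKit

          do {
              let range = NSRange(location: 0, length: someAttributedString.length)
              // rtfdFileWrapper(from:documentAttributes:) returns an optional FileWrapper and does not throw
              let rtfd = someAttributedString.rtfdFileWrapper(from: range, documentAttributes: [:])
              // Writing the package to disk is the call that can throw
              try rtfd?.write(to: URL(fileURLWithPath: "/Users/yourname/someFolder/RTFD.rtfd"), options: .atomic, originalContentsURL: nil)
          } catch {
              print("\(error)")
          }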


          When you call rtfd(from:documentAttributes:), you're getting flat Data. This flat data can then be encoded somewhere and read back into NSAttributedString. But make no mistake: RTFD is a package format ("D" stands for directory). So by instead calling rtfdFileWrapper(from:documentAttributes:), and writing that to a URL with the rtfd extension, we can see the actual package format that rtfd(from:documentAttributes:) replicates, but as a directory instead of raw data. In Finder, right click the generated file and choose "Show Package Contents".
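If you would rather inspect the package from code than from Finder, the sketch below lists the files inside the generated wrapper together with their sizes. The function name rtfdPackageListing is made up for this illustration, and the sample string has no attachments; with an image attached you would expect an extra entry per attachment.

```swift
import AppKit

/// Returns the name and byte count of every regular file inside the RTFD
/// package that AppKit generates for `text`.
/// `rtfdPackageListing` is a hypothetical helper, not an AppKit API.
func rtfdPackageListing(of text: NSAttributedString) -> [String: Int] {
    let range = NSRange(location: 0, length: text.length)
    guard let wrapper = text.rtfdFileWrapper(from: range, documentAttributes: [:]),
          wrapper.isDirectory,
          let children = wrapper.fileWrappers else { return [:] }

    var sizes: [String: Int] = [:]
    for (name, child) in children where child.isRegularFile {
        // regularFileContents is only valid for regular files, hence the filter above.
        sizes[name] = child.regularFileContents?.count ?? 0
    }
    return sizes
}

// Usage: print the package contents for a small sample string.
let listing = rtfdPackageListing(of: NSAttributedString(string: "Hello"))
for (name, size) in listing.sorted(by: { $0.key < $1.key }) {
    print("\(name): \(size) bytes")
}
```

Comparing the size reported for an attachment with the size of the original image file is a quick way to check whether the image was re-encoded on the way in.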



          The RTFD package contains an RTF file to specify the text and attributes, and a copy of each attachment. So why was your example so much bigger? In my tests, the answer seems to be that RTFD expects to find its images in PNG format. When calling rtfdFileWrapper(from:documentAttributes:) or rtfd(from:documentAttributes:), any image attachments seem to get written out as PNG files, which take up significantly more space. This happens because your image gets wrapped in an NSImage before getting wrapped in an NSTextAttachment. The NSImage is able to write the image data out in other formats, including larger formats like PNG.



          I'm assuming the image you tried was in a compressed format like JPEG, and NSAttributedString wrote it to RTFD as PNG.



          Using JPEG instead



          Assuming you're okay with the image being lossily compressed and not having info such as an alpha channel, you should be able to create an RTFD file with JPEG images.



          For example, I was able to get an RTFD file down to 2.8 MB from over 12 MB (large image) just by replacing the generated PNG image with the original JPG one. TextEdit initially refused to open the result, but once I changed the image's file extension to .png (even though it is still a JPG), it accepted it.



          In code it was even simpler. You may be able to get away with just changing how you add image attachments.



          // Don't do this unless you want PNG
          let image = NSImage(contentsOf: ...) // NSImage will write to a larger PNG file
          let pngAttachment = NSTextAttachment()
          pngAttachment.image = image

          // Do this if you want smaller files
          let imageData = try? Data(contentsOf: ...) // This will remain in raw JPG format
          // Explicitly specify JPG (kUTTypeJPEG comes from CoreServices)
          let jpegAttachment = NSTextAttachment(data: imageData, ofType: kUTTypeJPEG as String)


          Then when you create a new NSAttributedString with that NSTextAttachment and append it to NSTextStorage, the RTFD data you write out will be significantly smaller.



          Of course, you may not have control of this process if you're relying on Cocoa UI/API for attaching images. That could make the process more difficult and you may need to resort to modifying the generated data by swapping images.
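If the attachments are added by the text view itself, one possible workaround is to rewrite them just before saving. The sketch below walks the attributed string, re-encodes every attachment that exposes an NSImage as JPEG, and swaps in a data-backed attachment. Both function names are hypothetical, the 0.8 quality is an arbitrary choice, and attachments that only carry a file wrapper (no image property) are left untouched; whether the RTFD writer keeps the JPEG bytes as-is is the behavior described above, not something this code guarantees.

```swift
import AppKit

/// Re-encodes an NSImage as JPEG. Hypothetical helper; returns nil if the
/// image has no bitmap representation that can be converted.
func jpegData(from image: NSImage, quality: Double = 0.8) -> Data? {
    guard let tiff = image.tiffRepresentation,
          let rep = NSBitmapImageRep(data: tiff) else { return nil }
    return rep.representation(using: .jpeg, properties: [.compressionFactor: quality])
}

/// Returns a copy of `source` in which image attachments are replaced by
/// JPEG-backed attachments. Hypothetical helper for illustration.
func recompressingAttachments(in source: NSAttributedString,
                              quality: Double = 0.8) -> NSAttributedString {
    let result = NSMutableAttributedString(attributedString: source)
    let fullRange = NSRange(location: 0, length: result.length)
    var replacements: [(NSRange, NSTextAttachment)] = []

    // Collect first, mutate afterwards, so the enumeration never sees edits.
    result.enumerateAttribute(.attachment, in: fullRange, options: []) { value, range, _ in
        guard let old = value as? NSTextAttachment,
              let image = old.image,
              let jpeg = jpegData(from: image, quality: quality) else { return }
        let new = NSTextAttachment(data: jpeg, ofType: "public.jpeg")
        new.bounds = old.bounds // keep the on-screen size
        replacements.append((range, new))
    }
    for (range, attachment) in replacements {
        result.addAttribute(.attachment, value: attachment, range: range)
    }
    return result
}

// Usage: text without attachments passes through unchanged.
let untouched = recompressingAttachments(in: NSAttributedString(string: "abc"))
print(untouched.string)
```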



          Approach 2: Custom Format



          The approach described immediately above might be inconvenient due to not having control of the attachment-adding process and needing flat data. In that case a custom format might be better.



          There's nothing stopping you from designing your own format (binary, text, package, whatever) and then writing a coder for it. You could specify a specific image format or support a variety. It's up to you. And unless you're a fancy word processor, you probably don't need to store all the attributes like font all the time.




          I am also wondering whether there is a valid binary encoding option for Codable.




          First, note that NSAttributedString is an Objective-C class (when used on Apple platforms) and conforms to NSSecureCoding instead of Codable.



          Note that you cannot extend NSAttributedString to conform to Codable, because the init(from:) requirement on Decodable can only be satisfied by guaranteeing that the initializer will be available on all subclasses as well. Since this class is non-final, that means it can only be satisfied by a required init. Required initializers can only be specified on the original declaration, not in extensions.



          For this reason, if you wanted to conform it to Codable, you would need to use a wrapper object. enumerateAttributes(in:options:using:) should be helpful for getting the attributes and raw characters that need to be encoded, but you'll need to be sure to pay attention to the images too.
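A minimal sketch of such a wrapper follows. Instead of walking the attributes by hand, it takes a shortcut and leans on the NSSecureCoding conformance mentioned above: the attributed string is archived with NSKeyedArchiver and stored as a single Data value. The type name CodableAttributedString is invented for this example, and custom attribute values that do not support secure coding may fail to archive or unarchive.

```swift
import Foundation

/// Hypothetical wrapper that makes an NSAttributedString usable inside
/// Codable types by storing a keyed archive of it.
struct CodableAttributedString: Codable {
    let value: NSAttributedString

    init(_ value: NSAttributedString) {
        self.value = value
    }

    init(from decoder: Decoder) throws {
        let container = try decoder.singleValueContainer()
        let data = try container.decode(Data.self)
        guard let decoded = try NSKeyedUnarchiver.unarchivedObject(
            ofClass: NSAttributedString.self, from: data) else {
            throw DecodingError.dataCorruptedError(
                in: container,
                debugDescription: "Data is not an archived NSAttributedString")
        }
        self.value = decoded
    }

    func encode(to encoder: Encoder) throws {
        var container = encoder.singleValueContainer()
        let data = try NSKeyedArchiver.archivedData(
            withRootObject: value, requiringSecureCoding: true)
        try container.encode(data)
    }
}

// Usage: round-trip through JSON (wrapped in an array to avoid a top-level fragment).
let wrapped = [CodableAttributedString(NSAttributedString(string: "Hello"))]
let encodedJSON = try JSONEncoder().encode(wrapped)
let restored = try JSONDecoder().decode([CodableAttributedString].self, from: encodedJSON)
print(restored[0].value.string)
```

The same wrapper could store RTF or RTFD data instead of a keyed archive; only the two lines that produce and consume `data` would change.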



          As for encoding in binary, Codable is completely agnostic to format, so you could write your own types conforming to Encoder and Decoder that do whatever you want, including storing everything as raw bytes.
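Before writing a coder from scratch, it may be worth trying the binary one that already ships with Foundation: PropertyListEncoder with its outputFormat set to .binary. The sketch below compares it with JSONEncoder, which by default stores Data as a Base64 string (roughly 4/3 of the raw size). The Note type and the 30,000-byte stand-in payload are assumptions made up for the comparison, not part of the question.

```swift
import Foundation

// Hypothetical model type: a title plus the flat bytes of some document.
struct Note: Codable, Equatable {
    var title: String
    var contents: Data
}

// Stand-in payload: 30,000 arbitrary bytes instead of real RTFD data.
let payload = Data((0..<30_000).map { UInt8(truncatingIfNeeded: $0 &* 31 &+ 7) })
let note = Note(title: "Example", contents: payload)

// JSON: Data becomes a Base64 string by default.
let jsonBytes = try JSONEncoder().encode(note)

// Binary property list: Data is stored as raw bytes.
let plistEncoder = PropertyListEncoder()
plistEncoder.outputFormat = .binary
let plistBytes = try plistEncoder.encode(note)

// The decoder detects the property list format on its own.
let decodedNote = try PropertyListDecoder().decode(Note.self, from: plistBytes)

print("raw: \(payload.count), JSON: \(jsonBytes.count), binary plist: \(plistBytes.count)")
```

This only removes the text-encoding overhead; if the attachment itself was inflated when the RTFD data was generated, that cost remains.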



          Aside: Other Formats



          Here's a quick rundown of other supported formats (in order of size). In these tests, I used the very small string "Hello World! There's so much to see!" in the system font. After each format description (in parentheses) is the number of bytes to store that string.





          • Plain Text can store the above string in 36 bytes (1 for each character), but won't preserve attributes or attachments. (36 bytes)


          • RTF seems most lightweight if you need to preserve attributes but not attachments. (331 bytes)


          • HTML is next lightest, but isn't really designed to be a storage format. In my experience, some attributes such as line spacing get lost when converted to HTML by NSAttributedString. (536 bytes)


          • Binary Plist, which is made when you use NSKeyedArchiver, is a fine option if you only need compatibility with Apple platforms and don't like the above formats. This format supports images too, but is generally still larger than the above (and RTFD). (648 bytes)


          • Web Archive is next for size, but I don't recommend using it as WebKit has deprecated it. Safari still uses it though for some things. (784 bytes)


          • Word ML is probably only useful for those that already know they need it. This format and all below it will generally have a bunch of boilerplate that will become a smaller percentage of the file as text is added. (~1.2 MB)


          • Open Document (OASIS) is smaller than most of the Word formats, but you probably wouldn't use it without a good reason. (~2.4 MB)


          • Office Open XML is another format you'd only use if you needed that format exactly. (~3.5 MB)


          • Doc (Microsoft Word) This file is very large in comparison for small amounts of text. While I would expect this format to allow images, in my testing the file size did not actually go up when I added one. (~19.4 MB)


          • Mac Simple Text seems to always generate an error. (N/A)
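To reproduce this kind of comparison on your own content, something like the sketch below should work. It asks NSAttributedString for its data in a given document type and reports the byte count; the helper name byteCount(of:as:) is hypothetical, and the exact numbers will vary with OS version, fonts and attributes.

```swift
import AppKit

/// Returns the encoded size of `text` in the given document type, or nil if
/// the conversion fails. Hypothetical helper for illustration.
func byteCount(of text: NSAttributedString,
               as type: NSAttributedString.DocumentType) -> Int? {
    let range = NSRange(location: 0, length: text.length)
    let data = try? text.data(from: range,
                              documentAttributes: [.documentType: type])
    return data?.count
}

let sample = NSAttributedString(string: "Hello World! There's so much to see!")
let plainSize = byteCount(of: sample, as: .plain)
let rtfSize = byteCount(of: sample, as: .rtf)

// Usage: print the size for each format, or "failed" when conversion errors out.
for (label, size) in [("plain", plainSize), ("rtf", rtfSize)] {
    print("\(label): \(size.map(String.init) ?? "failed")")
}
```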


          Final Note



          In the end, the encoding experience for NSAttributedString should get better as Foundation continues to adapt to Swift rather than Objective-C. You can imagine a day when NSAttributedString or some similar Swifty type conforms to Codable out of the box and can then be paired with an Encoder and Decoder for any file format.
































          • Thank you very much for your detailed answer! Do you also know a good/the best way to save NSAttributedStrings which do not contain images with the least amount of memory usage?

            – Qbyte
            Feb 12 at 22:28











          • @Qbyte Yep, I added an aside to my answer with other formats.

            – Matthew Seaman
            Feb 12 at 23:23













          • Thanks for the detailed answer. This is very helpful to explore a solution that will work for my case.

            – jmdecombe
            Mar 29 at 15:52












          Your Answer






          StackExchange.ifUsing("editor", function () {
          StackExchange.using("externalEditor", function () {
          StackExchange.using("snippets", function () {
          StackExchange.snippets.init();
          });
          });
          }, "code-snippets");

          StackExchange.ready(function() {
          var channelOptions = {
          tags: "".split(" "),
          id: "1"
          };
          initTagRenderer("".split(" "), "".split(" "), channelOptions);

          StackExchange.using("externalEditor", function() {
          // Have to fire editor after snippets, if snippets enabled
          if (StackExchange.settings.snippets.snippetsEnabled) {
          StackExchange.using("snippets", function() {
          createEditor();
          });
          }
          else {
          createEditor();
          }
          });

          function createEditor() {
          StackExchange.prepareEditor({
          heartbeatType: 'answer',
          autoActivateHeartbeat: false,
          convertImagesToLinks: true,
          noModals: true,
          showLowRepImageUploadWarning: true,
          reputationToPostImages: 10,
          bindNavPrevention: true,
          postfix: "",
          imageUploader: {
          brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
          contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
          allowUrls: true
          },
          onDemand: true,
          discardSelector: ".discard-answer"
          ,immediatelyShowMarkdownHelp:true
          });


          }
          });














          draft saved

          draft discarded


















          StackExchange.ready(
          function () {
          StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53461875%2fhow-to-optimize-storage-of-nsattributedstring-in-swift-using-data-and-codable%23new-answer', 'question_page');
          }
          );

          Post as a guest















          Required, but never shown

























          1 Answer
          1






          active

          oldest

          votes








          1 Answer
          1






          active

          oldest

          votes









          active

          oldest

          votes






          active

          oldest

          votes









          6





          +50









          TL;DR: RTFD encodes images as PNGs, but you can make it encode JPGs instead to save space. A custom format might be better and easier though if you have the time to create one.



          NSAttributedString can encode to HTML, rtf, rtfd, plain text, a variety of Office/Word formats, etc. Given that each of these is an official format with an official spec that must be followed, there's not much that can be done in terms of saving space other than:




          1. Choosing the supported format that works best for your use cases and has the smallest footprint.


          OR




          1. Writing your own format.


          Approach 1: RTFD



          Of the supported format, RTFD does indeed sound best for your use case because it includes support for attachments such as images. Feel free to try out other included formats, of which descriptions are below in "Other Formats".




          Saving it as Data, for example using the rtfd(from:documentAttributes:) method, and as part of a Codable structure, results in a very large string, much larger than the content itself especially when inserting an image into the NSTextView. For example, inserting a 200K image will result in a 5MB JSON file.




          To understand what is happening here, try out the following code:



          do {
          let rtfd = try someAttributedString.rtfdFileWrapper(from: NSRange(location: 0, length: someAttributedString.length), documentAttributes: [:])
          rtfd?.write(to: URL(fileURLWithPath: "/Users/yourname/someFolder/RTFD.rtfd"), options: .atomic, originalContentsURL: nil)
          } catch {
          print("(error)")
          }


          When you call rtfd(from:documentAttributes:), you're getting flat Data. This flat data can then be encoded somewhere and read back into NSAttributedString. But make no mistake: RTFD is a package format ("D" stands for directory). So by instead calling rtfdFileWrapper(from:documentAttributes:), and writing that to a URL with the rtfd extension, we can see the actual package format that rtfd(from:documentAttributes:) replicates, but as a directory instead of raw data. In Finder, right click the generated file and choose "Show Package Contents".



          The RTFD package contains an RTF file to specify the text and attribitues, and a copy of each attachment. So why was your example so much bigger? In my tests, the answer seems to be that RTFD expects to find its images in PNG format. When calling rtfdFileWrapper(from:documentAttributes:) or rtfd(from:documentAttributes:), any image attachments seem to get written out as PNG files, which take up significantly more space. This happens because your image gets wrapped in a NSImage before getting wrapped in a NSTextAttachment. The NSImage is able to write the image data out in other formats, including larger formats like PNG.



          I'm assuming the image you tried was in a compressed format like JPEG, and NSAttributedString wrote it to RTFD as PNG.



          Using JPEG instead



          Assuming you're okay with the image being compressed and not having info such as an alpha channel, you should be able to create an RTFD file with jpg images.



          For example, I was able to get an RTFD file down to 2.8 MB from over 12 MB (large image) just by replacing the generated PNG image with the original JPG one. This initially was unacceptible to TextEdit but I then changed the file extension of the image to .png (even though it is still a JPG) and it accepted it.



          In code it was even simpler. You may be able to get away with just changing how you add image attachments.



          // Don't do this unless you want PNG
          let image = NSImage(contentsOf: ...) // NSImage will write to a larger PNG file
          let attachment = NSTextAttachment()
          attachment.image = image

          // Do this if you want smaller files
          let image = try? Data(contentsOf: ...) // This will remain in raw JPG format
          let attachment = NSTextAttachment(data: image, ofType: kUTTypeJPEG as String) // Explicitly specify JPG


          Then when you create a new NSAttributedString with that NSTextAttachment and append it to NSTextStorage, writing RTFD data will be signifantly smaller.



          Of course, you may not have control of this process if you're relying on Cocoa UI/API for attaching images. That could make the process more difficult and you may need to resort to modifying the generated data by swapping images.



          Approach 2: Custom Format



          The approach described immediately above might be inconvenient due to not having control of the attachment-adding process and needing flat data. In that case a custom format might be better.



          There's nothing stopping you from designing your own format (binary, text, package, whatever) and then writing a coder for it. You could specify a specific image format or support a variety. It's up to you. And unless you're a fancy word processor, you probably don't need to store all the attributes like font all the time.




          I am also wondering whether there is a valid binary encoding option for Codable.




          First, note that NSAttributedString is an Objective-C class (when used on Apple platforms) and conforms to NSSecureCoding instead of Codable.



          Note that you cannot extend NSAttributedString to conform to Codable, because the init(from:) requirement on Decodable can only be satisfied by guarenteeing that the initializer will be included on all subclasses as well. Since this class is non-final, that means it can only be satisfied by a required init. Required initializers can only be specified on the original declaration, not extensions.



          For this reason, if you wanted to conform it to Codable, you would need to use a wrapper object. enumerateAttributes(in:options:using:) should be helpful for getting the attributes and raw characters that need encoded, but you'll need to be sure to pay attention to the images too.



          As for encoding in binary, Codable is completely agnostic to format, so you could write your own object conforming to Coder that does whatever you want, including store everything using raw bytes.



          Aside: Other Formats



          Here's a quick rundown of other supported formats (in order of size). In these tests, I used the very small string "Hello World! There's so much to see!" in the system font. After each format description (in parentheses) is the number of bytes to store that string.





          • Plain Text can store the above format in 36 bytes (1 for each character), but won't preserve attributes or attachments. (36 bytes)


          • RTF seems most lightweight if you need to preserve attributes but not attachments. (331 bytes)


          • HTML Is next lightest, but isn't really designed to be a storage format. In my experience, some attributes such as line spacing get lost when converted to HTML by NSAttributedString. (536 bytes)


          • Binary Plist, which is made when you use NSKeyedArchiver, is a fine option if you only need compatibility with Apple platforms and don't like the above formats. This format supports images too, but is generally still larger than the above (and RTFD). (648 bytes)


          • Web Archive is next for size, but I don't recommend using it as WebKit has deprecated it. Safari still uses it though for some things. (784 bytes)


          • Word ML is probably only useful for those that already know they need it. This format and all below it will generally have a bunch of boilerplate that will become a smaller percentage of the file as text is added. (~1.2 MB)


          • Open Document (OASIS) is smaller than most of the Word formats, but you probably wouldn't use it without a good reason. (~2.4 MB)


          • Office Open XML Is another format you'd only use if you needed that format exactly. (~3.5 MB)


          • Doc (Microsoft Word) This file is very large in comparison for small amounts of text. While I would expect this format to allow images, in my testing the file size did not actually go up when I added one. (~19.4 MB)


          • Mac Simple Text seems to always generate an error. (N/A)


          Final Note



          In the end, the encoding experience for NSAttributedString should get better as Foundation continues to adapt to Swift rather than Objective-C. You can imagine a day where NSAttributedString or some similar Swifty type conforms to Codable out of the box and can then be paired with any file format Coder.






          share|improve this answer


























          • Thank you very much for your detailed answer! Do you also know a good/the best way to save NSAttributedStrings which do not contain images with the least amount of memory usage?

            – Qbyte
            Feb 12 at 22:28











          • @Qbyte Yep, I added an aside to my answer with other formats.

            – Matthew Seaman
            Feb 12 at 23:23













          • Thanks for the detailed answer. This is very helpful to explore a solution that will work for my case.

            – jmdecombe
            Mar 29 at 15:52
















          6





          +50









          TL;DR: RTFD encodes images as PNGs, but you can make it encode JPGs instead to save space. A custom format might be better and easier though if you have the time to create one.



          NSAttributedString can encode to HTML, rtf, rtfd, plain text, a variety of Office/Word formats, etc. Given that each of these is an official format with an official spec that must be followed, there's not much that can be done in terms of saving space other than:




          1. Choosing the supported format that works best for your use cases and has the smallest footprint.


          OR




          1. Writing your own format.


          Approach 1: RTFD



          Of the supported format, RTFD does indeed sound best for your use case because it includes support for attachments such as images. Feel free to try out other included formats, of which descriptions are below in "Other Formats".




          Saving it as Data, for example using the rtfd(from:documentAttributes:) method, and as part of a Codable structure, results in a very large string, much larger than the content itself especially when inserting an image into the NSTextView. For example, inserting a 200K image will result in a 5MB JSON file.




          To understand what is happening here, try out the following code:



          do {
          let rtfd = try someAttributedString.rtfdFileWrapper(from: NSRange(location: 0, length: someAttributedString.length), documentAttributes: [:])
          rtfd?.write(to: URL(fileURLWithPath: "/Users/yourname/someFolder/RTFD.rtfd"), options: .atomic, originalContentsURL: nil)
          } catch {
          print("(error)")
          }


          When you call rtfd(from:documentAttributes:), you're getting flat Data. This flat data can then be encoded somewhere and read back into NSAttributedString. But make no mistake: RTFD is a package format ("D" stands for directory). So by instead calling rtfdFileWrapper(from:documentAttributes:), and writing that to a URL with the rtfd extension, we can see the actual package format that rtfd(from:documentAttributes:) replicates, but as a directory instead of raw data. In Finder, right click the generated file and choose "Show Package Contents".



          The RTFD package contains an RTF file to specify the text and attribitues, and a copy of each attachment. So why was your example so much bigger? In my tests, the answer seems to be that RTFD expects to find its images in PNG format. When calling rtfdFileWrapper(from:documentAttributes:) or rtfd(from:documentAttributes:), any image attachments seem to get written out as PNG files, which take up significantly more space. This happens because your image gets wrapped in a NSImage before getting wrapped in a NSTextAttachment. The NSImage is able to write the image data out in other formats, including larger formats like PNG.



          I'm assuming the image you tried was in a compressed format like JPEG, and NSAttributedString wrote it to RTFD as PNG.



          Using JPEG instead



          Assuming you're okay with the image being compressed and not having info such as an alpha channel, you should be able to create an RTFD file with jpg images.



          For example, I was able to get an RTFD file down to 2.8 MB from over 12 MB (large image) just by replacing the generated PNG image with the original JPG one. This initially was unacceptible to TextEdit but I then changed the file extension of the image to .png (even though it is still a JPG) and it accepted it.



          In code it was even simpler. You may be able to get away with just changing how you add image attachments.



          // Don't do this unless you want PNG
          let image = NSImage(contentsOf: ...) // NSImage will write to a larger PNG file
          let attachment = NSTextAttachment()
          attachment.image = image

          // Do this if you want smaller files
          let image = try? Data(contentsOf: ...) // This will remain in raw JPG format
          let attachment = NSTextAttachment(data: image, ofType: kUTTypeJPEG as String) // Explicitly specify JPG


          Then when you create a new NSAttributedString with that NSTextAttachment and append it to NSTextStorage, writing RTFD data will be signifantly smaller.



          Of course, you may not have control of this process if you're relying on Cocoa UI/API for attaching images. That could make the process more difficult and you may need to resort to modifying the generated data by swapping images.



          Approach 2: Custom Format



          The approach described immediately above might be inconvenient due to not having control of the attachment-adding process and needing flat data. In that case a custom format might be better.



          There's nothing stopping you from designing your own format (binary, text, package, whatever) and then writing a coder for it. You could specify a specific image format or support a variety. It's up to you. And unless you're a fancy word processor, you probably don't need to store all the attributes like font all the time.
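To make the idea concrete, here is a minimal sketch of one possible custom model, assuming you only care about the text and the font of each run. The `TextRun` type and the `runs(from:)` function are names invented for this illustration, and attachments are deliberately ignored.

```swift
import AppKit

/// Hypothetical on-disk model: one entry per run of identical attributes.
struct TextRun: Codable, Equatable {
    var text: String
    var fontName: String?   // nil when the run has no explicit font
    var fontSize: Double?
}

/// Splits an attributed string into Codable runs, keeping only the font.
func runs(from attributed: NSAttributedString) -> [TextRun] {
    var result: [TextRun] = []
    let fullRange = NSRange(location: 0, length: attributed.length)
    attributed.enumerateAttributes(in: fullRange, options: []) { attributes, range, _ in
        let font = attributes[.font] as? NSFont
        result.append(TextRun(
            text: attributed.attributedSubstring(from: range).string,
            fontName: font?.fontName,
            fontSize: font.map { Double($0.pointSize) }
        ))
    }
    return result
}
```

Because `[TextRun]` is plain Codable data, it can be handed to any encoder, and you decide separately how image bytes are stored.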




          I am also wondering whether there is a valid binary encoding option for Codable.




          First, note that NSAttributedString is an Objective-C class (when used on Apple platforms) and conforms to NSSecureCoding instead of Codable.



          Note that you cannot extend NSAttributedString to conform to Codable, because the init(from:) requirement on Decodable can only be satisfied by guaranteeing that the initializer is available on all subclasses as well. Since this class is non-final, that means it can only be satisfied by a required init, and required initializers can only be declared in the original class declaration, not in extensions.



          For this reason, if you wanted to conform it to Codable, you would need to use a wrapper object. enumerateAttributes(in:options:using:) should be helpful for getting the attributes and raw characters that need to be encoded, but you'll need to be sure to pay attention to the images too.
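One way such a wrapper could look, as a sketch rather than a recommendation: it leans on the existing NSSecureCoding support and stores a keyed archive as Data. The type name `ArchivedAttributedString` and the `archive` key are made up for this example, and it assumes an Apple platform.

```swift
import Foundation

/// Hypothetical wrapper that gives NSAttributedString a Codable shell
/// by storing its NSKeyedArchiver output.
struct ArchivedAttributedString: Codable {
    let value: NSAttributedString

    private enum CodingKeys: String, CodingKey { case archive }

    init(_ value: NSAttributedString) { self.value = value }

    init(from decoder: Decoder) throws {
        let container = try decoder.container(keyedBy: CodingKeys.self)
        let data = try container.decode(Data.self, forKey: .archive)
        guard let decoded = try NSKeyedUnarchiver.unarchivedObject(
            ofClass: NSAttributedString.self, from: data) else {
            throw DecodingError.dataCorruptedError(
                forKey: .archive, in: container,
                debugDescription: "Archive did not contain an NSAttributedString")
        }
        value = decoded
    }

    func encode(to encoder: Encoder) throws {
        var container = encoder.container(keyedBy: CodingKeys.self)
        let data = try NSKeyedArchiver.archivedData(
            withRootObject: value, requiringSecureCoding: true)
        try container.encode(data, forKey: .archive)
    }
}
```

You could swap the archive for RTF or RTFD data in the same two methods if you prefer one of those formats.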



          As for encoding in binary, Codable is completely agnostic to format, so you could write your own types conforming to Encoder and Decoder that do whatever you want, including storing everything as raw bytes.
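Before writing your own encoder, it may be worth trying the binary option Foundation already ships: PropertyListEncoder with its output format set to binary. The sketch below compares it with JSONEncoder for a value carrying a Data payload; the `StoredNote` type and `encodeBothWays` function are hypothetical names for illustration.

```swift
import Foundation

/// Hypothetical Codable value with a binary payload (e.g. RTFD data).
struct StoredNote: Codable, Equatable {
    var title: String
    var payload: Data
}

/// Encodes the same value as JSON and as a binary property list.
func encodeBothWays(_ note: StoredNote) throws -> (json: Data, binaryPlist: Data) {
    // JSONEncoder writes Data as a Base64 string by default.
    let json = try JSONEncoder().encode(note)

    // A binary property list can store the bytes directly.
    let plistEncoder = PropertyListEncoder()
    plistEncoder.outputFormat = .binary
    let binaryPlist = try plistEncoder.encode(note)

    return (json, binaryPlist)
}
```

Reading the value back uses `PropertyListDecoder().decode(StoredNote.self, from:)`, so the rest of your Codable code stays unchanged.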



          Aside: Other Formats



          Here's a quick rundown of other supported formats (in order of size). In these tests, I used the very small string "Hello World! There's so much to see!" in the system font. After each format description (in parentheses) is the number of bytes to store that string.





          • Plain Text can store the above string in 36 bytes (1 for each character), but won't preserve attributes or attachments. (36 bytes)


          • RTF seems most lightweight if you need to preserve attributes but not attachments. (331 bytes)


          • HTML is next lightest, but isn't really designed to be a storage format. In my experience, some attributes such as line spacing get lost when converted to HTML by NSAttributedString. (536 bytes)


          • Binary Plist, which is made when you use NSKeyedArchiver, is a fine option if you only need compatibility with Apple platforms and don't like the above formats. This format supports images too, but is generally still larger than the above (and RTFD). (648 bytes)


          • Web Archive is next for size, but I don't recommend using it as WebKit has deprecated it. Safari still uses it though for some things. (784 bytes)


          • Word ML is probably only useful for those that already know they need it. This format and all below it will generally have a bunch of boilerplate that will become a smaller percentage of the file as text is added. (~1.2 MB)


          • Open Document (OASIS) is smaller than most of the Word formats, but you probably wouldn't use it without a good reason. (~2.4 MB)


          • Office Open XML is another format you'd only use if you needed that format exactly. (~3.5 MB)


          • Doc (Microsoft Word) This file is very large in comparison for small amounts of text. While I would expect this format to allow images, in my testing the file size did not actually go up when I added one. (~19.4 MB)


          • Mac Simple Text seems to always generate an error. (N/A)
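If you want to repeat this kind of measurement on your own content, a sketch like the following should do. The `encodedSize(of:as:)` function is a hypothetical helper, and forcing UTF-8 for plain text is my assumption so that the byte count matches the character count for ASCII strings.

```swift
import AppKit

/// Hypothetical helper: returns the number of bytes needed to store `text`
/// in the given document type.
func encodedSize(of text: NSAttributedString,
                 as type: NSAttributedString.DocumentType) throws -> Int {
    let fullRange = NSRange(location: 0, length: text.length)
    var attributes: [NSAttributedString.DocumentAttributeKey: Any] = [.documentType: type]
    if type == .plain {
        // Assumption: pin the plain-text encoding to UTF-8.
        attributes[.characterEncoding] = String.Encoding.utf8.rawValue
    }
    return try text.data(from: fullRange, documentAttributes: attributes).count
}
```

For example, `try encodedSize(of: sample, as: .rtf)` and `try encodedSize(of: sample, as: .plain)` give the two smallest entries in the list for your own sample string.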


          Final Note



          In the end, the encoding experience for NSAttributedString should get better as Foundation continues to adapt to Swift rather than Objective-C. You can imagine a day when NSAttributedString, or some similar Swifty type, conforms to Codable out of the box and can then be paired with an encoder for any file format.
































          • Thank you very much for your detailed answer! Do you also know a good/the best way to save NSAttributedStrings which do not contain images with the least amount of memory usage?

            – Qbyte
            Feb 12 at 22:28











          • @Qbyte Yep, I added an aside to my answer with other formats.

            – Matthew Seaman
            Feb 12 at 23:23













          • Thanks for the detailed answer. This is very helpful to explore a solution that will work for my case.

            – jmdecombe
            Mar 29 at 15:52










































          edited Feb 12 at 23:29

























          answered Feb 12 at 21:33









          Matthew Seaman
































