How To Optimize Storage Of NSAttributedString In Swift Using Data And Codable?
I am trying to optimize storage space when saving the contents of an NSTextView, namely its NSTextStorage property, itself an NSAttributedString.
Saving it as Data, for example using the rtfd(from:documentAttributes:) method, and as part of a Codable structure, results in a very large string, much larger than the content itself, especially when inserting an image into the NSTextView. For example, inserting a 200K image will result in a 5MB JSON file.
Side note: It is even worse when the Data object is encoded directly rather than as a property of the encoded object, as it is then encoded as an array of small integers rather than as an arbitrary string. I am not sure why, though I was able to prevent this by inserting the Data into a simple wrapper structure.
Strangely, compressing the actual JSON file using ZIP still results in a 4MB file, merely a 20% gain, so it is unclear to me how a 200K image could turn into such a massive, hardly compressible encoded string.
I would like to figure out the proper way to efficiently store an NSAttributedString using the Codable protocol. Any hint or advice is much appreciated.
I am also wondering whether there is a valid binary encoding option for Codable.
swift codable
edited Feb 13 at 0:06 by Dávid Pásztor
asked Nov 24 '18 at 19:58 by jmdecombe
1 Answer
TL;DR: RTFD encodes images as PNGs, but you can make it encode JPGs instead to save space. A custom format might be better and easier though if you have the time to create one.
NSAttributedString can encode to HTML, RTF, RTFD, plain text, a variety of Office/Word formats, etc. Given that each of these is an official format with an official spec that must be followed, there's not much that can be done in terms of saving space other than:
- Choosing the supported format that works best for your use cases and has the smallest footprint.
OR
- Writing your own format.
Approach 1: RTFD
Of the supported formats, RTFD does indeed sound best for your use case because it includes support for attachments such as images. Feel free to try out the other included formats, which are described below in "Other Formats".
Saving it as Data, for example using the rtfd(from:documentAttributes:) method, and as part of a Codable structure, results in a very large string, much larger than the content itself especially when inserting an image into the NSTextView. For example, inserting a 200K image will result in a 5MB JSON file.
To understand what is happening here, try out the following code:
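    do {
        let range = NSRange(location: 0, length: someAttributedString.length)
        // rtfdFileWrapper(from:documentAttributes:) returns an optional FileWrapper and does not throw
        let rtfd = someAttributedString.rtfdFileWrapper(from: range, documentAttributes: [:])
        // Writing the wrapper to disk is the call that can throw
        try rtfd?.write(to: URL(fileURLWithPath: "/Users/yourname/someFolder/RTFD.rtfd"), options: .atomic, originalContentsURL: nil)
    } catch {
        print("\(error)")
    }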
When you call rtfd(from:documentAttributes:), you're getting flat Data. This flat data can then be encoded somewhere and read back into an NSAttributedString. But make no mistake: RTFD is a package format ("D" stands for directory). So by instead calling rtfdFileWrapper(from:documentAttributes:) and writing the result to a URL with the rtfd extension, we can see the actual package format that rtfd(from:documentAttributes:) replicates, but as a directory instead of raw data. In Finder, right-click the generated file and choose "Show Package Contents".
The RTFD package contains an RTF file to specify the text and attributes, and a copy of each attachment. So why was your example so much bigger? In my tests, the answer seems to be that RTFD expects to find its images in PNG format. When calling rtfdFileWrapper(from:documentAttributes:) or rtfd(from:documentAttributes:), any image attachments seem to get written out as PNG files, which take up significantly more space. This happens because your image gets wrapped in an NSImage before getting wrapped in an NSTextAttachment. The NSImage no longer keeps the original file data and is able to write the image out in other formats, including larger formats like PNG.
I'm assuming the image you tried was in a compressed format like JPEG, and NSAttributedString wrote it to RTFD as PNG.
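To check which file inside the package is responsible for the size, something like the following rough sketch may help. The helper name fileSizes(in:) is made up for illustration, and the sample package is a stand-in with arbitrary contents; in practice you would pass the FileWrapper returned by rtfdFileWrapper(from:documentAttributes:).

```swift
import Foundation

/// Hypothetical helper: returns the byte count of every regular file
/// directly inside a package (directory) file wrapper.
func fileSizes(in package: FileWrapper) -> [String: Int] {
    var sizes: [String: Int] = [:]
    for (name, child) in package.fileWrappers ?? [:] where child.isRegularFile {
        sizes[name] = child.regularFileContents?.count ?? 0
    }
    return sizes
}

// Stand-in package with made-up file names and zero-filled contents.
// A real RTFD wrapper would be used here instead.
let samplePackage = FileWrapper(directoryWithFileWrappers: [
    "TXT.rtf": FileWrapper(regularFileWithContents: Data(count: 300)),
    "photo.png": FileWrapper(regularFileWithContents: Data(count: 5_000)),
])

// Print the largest files first.
for (name, bytes) in fileSizes(in: samplePackage).sorted(by: { $0.value > $1.value }) {
    print("\(name): \(bytes) bytes")
}
```

If the image entry dwarfs the RTF entry, the attachment encoding is what needs attention, not the text attributes.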
Using JPEG instead
Assuming you're okay with the image being compressed and not having info such as an alpha channel, you should be able to create an RTFD file with JPG images.
For example, I was able to get an RTFD file down to 2.8 MB from over 12 MB (large image) just by replacing the generated PNG image with the original JPG one. TextEdit initially refused to open the result, but once I changed the file extension of the image to .png (even though it is still a JPG), it accepted it.
In code it was even simpler. You may be able to get away with just changing how you add image attachments.
    // Don't do this unless you want PNG
    let image = NSImage(contentsOf: ...) // NSImage will write to a larger PNG file
    let attachment = NSTextAttachment()
    attachment.image = image

    // Do this if you want smaller files
    let imageData = try? Data(contentsOf: ...) // This will remain in raw JPG format
    let jpegAttachment = NSTextAttachment(data: imageData, ofType: kUTTypeJPEG as String) // Explicitly specify JPG
Then, when you create a new NSAttributedString with that NSTextAttachment and append it to the NSTextStorage, the RTFD data you write out will be significantly smaller.
Of course, you may not have control of this process if you're relying on Cocoa UI/API for attaching images. That could make the process more difficult and you may need to resort to modifying the generated data by swapping images.
Approach 2: Custom Format
The approach described immediately above might be inconvenient due to not having control of the attachment-adding process and needing flat data. In that case a custom format might be better.
There's nothing stopping you from designing your own format (binary, text, package, whatever) and then writing a coder for it. You could specify a specific image format or support a variety. It's up to you. And unless you're a fancy word processor, you probably don't need to store all the attributes like font all the time.
I am also wondering whether there is a valid binary encoding option for Codable.
First, note that NSAttributedString is an Objective-C class (when used on Apple platforms) and conforms to NSSecureCoding instead of Codable.
Note that you cannot extend NSAttributedString to conform to Codable, because the init(from:) requirement on Decodable can only be satisfied by guaranteeing that the initializer will be included on all subclasses as well. Since this class is non-final, that means it can only be satisfied by a required init. Required initializers can only be specified on the original declaration, not in extensions.
For this reason, if you wanted to make it Codable, you would need to use a wrapper object. enumerateAttributes(in:options:using:) should be helpful for getting the attributes and raw characters that need to be encoded, but you'll need to be sure to pay attention to the images too.
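As a minimal sketch of that wrapper idea, the following stores the plain text plus a list of attribute runs collected with enumerateAttributes(in:options:using:). The type names AttributeRun and CodableAttributedText and the "example.note" key are invented for illustration, and the sketch deliberately keeps only String-valued attributes; fonts, colors, and image attachments would each need their own encoding.

```swift
import Foundation

// Hypothetical types for illustration; not part of Foundation.
struct AttributeRun: Codable, Equatable {
    var location: Int
    var length: Int
    var attributes: [String: String]   // attribute key -> String value only
}

struct CodableAttributedText: Codable, Equatable {
    var text: String
    var runs: [AttributeRun]

    init(_ attributed: NSAttributedString) {
        var collected: [AttributeRun] = []
        let fullRange = NSRange(location: 0, length: attributed.length)
        attributed.enumerateAttributes(in: fullRange, options: []) { attrs, range, _ in
            // Assumption: only String values are kept; everything else is dropped.
            var stored: [String: String] = [:]
            for (key, value) in attrs {
                if let string = value as? String { stored[key.rawValue] = string }
            }
            if !stored.isEmpty {
                collected.append(AttributeRun(location: range.location,
                                              length: range.length,
                                              attributes: stored))
            }
        }
        text = attributed.string
        runs = collected
    }

    /// Rebuilds an attributed string from the stored text and runs.
    var attributedString: NSAttributedString {
        let result = NSMutableAttributedString(string: text)
        for run in runs {
            var attrs: [NSAttributedString.Key: Any] = [:]
            for (name, value) in run.attributes {
                attrs[NSAttributedString.Key(rawValue: name)] = value
            }
            result.addAttributes(attrs, range: NSRange(location: run.location, length: run.length))
        }
        return result
    }
}

// Round trip through JSON with a made-up custom attribute.
let noteKey = NSAttributedString.Key(rawValue: "example.note")
let original = NSMutableAttributedString(string: "Hello World!")
original.addAttribute(noteKey, value: "greeting", range: NSRange(location: 0, length: 5))

let encoded = try JSONEncoder().encode(CodableAttributedText(original))
let decoded = try JSONDecoder().decode(CodableAttributedText.self, from: encoded)
let restored = decoded.attributedString
```

Because the wrapper is an ordinary Codable struct, it can be paired with any encoder, not just JSON.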
As for encoding in binary, Codable is completely agnostic to format, so you could write your own types conforming to Encoder and Decoder that do whatever you want, including storing everything as raw bytes.
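Before writing a custom encoder, it may be worth trying the binary option Foundation already ships: PropertyListEncoder with its outputFormat set to .binary. The sketch below compares it with JSONEncoder, which by default writes Data as Base64 text (roughly a third larger than the raw bytes). StoredDocument and the generated payload are placeholders I made up for the comparison, not real RTFD data.

```swift
import Foundation

// Hypothetical wrapper type; the name and fields are illustrative only.
struct StoredDocument: Codable, Equatable {
    var title: String
    var rtfd: Data   // e.g. the result of rtfd(from:documentAttributes:)
}

// Stand-in payload: 30,000 arbitrary bytes instead of real RTFD data.
let payload = Data((0..<30_000).map { UInt8(truncatingIfNeeded: $0 &* 31) })
let document = StoredDocument(title: "Example", rtfd: payload)

// JSON has no binary type, so JSONEncoder writes Data as Base64 text by default.
let json = try JSONEncoder().encode(document)

// A binary property list stores Data as raw bytes.
let plistEncoder = PropertyListEncoder()
plistEncoder.outputFormat = .binary
let plist = try plistEncoder.encode(document)

print("payload: \(payload.count) bytes, JSON: \(json.count) bytes, binary plist: \(plist.count) bytes")

// Decoding works the same way as with JSON.
let restoredDocument = try PropertyListDecoder().decode(StoredDocument.self, from: plist)
```

This does not shrink the image itself, so it complements the JPEG approach above rather than replacing it.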
Aside: Other Formats
Here's a quick rundown of other supported formats (in order of size). In these tests, I used the very small string "Hello World! There's so much to see!" in the system font. After each format description (in parentheses) is the size needed to store that string.
- Plain Text can store the above string in 36 bytes (1 for each character), but won't preserve attributes or attachments. (36 bytes)
- RTF seems most lightweight if you need to preserve attributes but not attachments. (331 bytes)
- HTML is next lightest, but isn't really designed to be a storage format. In my experience, some attributes such as line spacing get lost when converted to HTML by NSAttributedString. (536 bytes)
- Binary Plist, which is made when you use NSKeyedArchiver, is a fine option if you only need compatibility with Apple platforms and don't like the above formats. This format supports images too, but is generally still larger than the above (and RTFD). (648 bytes)
- Web Archive is next for size, but I don't recommend using it as WebKit has deprecated it. Safari still uses it for some things, though. (784 bytes)
- Word ML is probably only useful for those that already know they need it. This format and all below it will generally have a bunch of boilerplate that will become a smaller percentage of the file as text is added. (~1.2 MB)
- Open Document (OASIS) is smaller than most of the Word formats, but you probably wouldn't use it without a good reason. (~2.4 MB)
- Office Open XML is another format you'd only use if you needed that format exactly. (~3.5 MB)
- Doc (Microsoft Word) is very large in comparison for small amounts of text. While I would expect this format to allow images, in my testing the file size did not actually go up when I added one. (~19.4 MB)
- Mac Simple Text seems to always generate an error. (N/A)
Final Note
In the end, the encoding experience for NSAttributedString should get better as Foundation continues to adapt to Swift rather than Objective-C. You can imagine a day where NSAttributedString or some similar Swifty type conforms to Codable out of the box and can then be paired with an encoder for any file format.
Thank you very much for your detailed answer! Do you also know a good/the best way to save NSAttributedStrings which do not contain images with the least amount of memory usage?
– Qbyte
Feb 12 at 22:28
@Qbyte Yep, I added an aside to my answer with other formats.
– Matthew Seaman
Feb 12 at 23:23
Thanks for the detailed answer. This is very helpful to explore a solution that will work for my case.
– jmdecombe
Mar 29 at 15:52
TL;DR: RTFD encodes images as PNGs, but you can make it encode JPGs instead to save space. A custom format might be better and easier though if you have the time to create one.
NSAttributedString
can encode to HTML, rtf, rtfd, plain text, a variety of Office/Word formats, etc. Given that each of these is an official format with an official spec that must be followed, there's not much that can be done in terms of saving space other than:
- Choosing the supported format that works best for your use cases and has the smallest footprint.
OR
- Writing your own format.
Approach 1: RTFD
Of the supported format, RTFD does indeed sound best for your use case because it includes support for attachments such as images. Feel free to try out other included formats, of which descriptions are below in "Other Formats".
Saving it as Data, for example using the rtfd(from:documentAttributes:) method, and as part of a Codable structure, results in a very large string, much larger than the content itself especially when inserting an image into the NSTextView. For example, inserting a 200K image will result in a 5MB JSON file.
To understand what is happening here, try out the following code:
do {
let rtfd = try someAttributedString.rtfdFileWrapper(from: NSRange(location: 0, length: someAttributedString.length), documentAttributes: [:])
rtfd?.write(to: URL(fileURLWithPath: "/Users/yourname/someFolder/RTFD.rtfd"), options: .atomic, originalContentsURL: nil)
} catch {
print("(error)")
}
When you call rtfd(from:documentAttributes:)
, you're getting flat Data
. This flat data can then be encoded somewhere and read back into NSAttributedString
. But make no mistake: RTFD is a package format ("D" stands for directory). So by instead calling rtfdFileWrapper(from:documentAttributes:)
, and writing that to a URL
with the rtfd
extension, we can see the actual package format that rtfd(from:documentAttributes:)
replicates, but as a directory instead of raw data. In Finder, right click the generated file and choose "Show Package Contents".
The RTFD package contains an RTF file to specify the text and attribitues, and a copy of each attachment. So why was your example so much bigger? In my tests, the answer seems to be that RTFD expects to find its images in PNG format. When calling rtfdFileWrapper(from:documentAttributes:)
or rtfd(from:documentAttributes:)
, any image attachments seem to get written out as PNG files, which take up significantly more space. This happens because your image gets wrapped in a NSImage
before getting wrapped in a NSTextAttachment
. The NSImage
is able to write the image data out in other formats, including larger formats like PNG.
I'm assuming the image you tried was in a compressed format like JPEG, and NSAttributedString
wrote it to RTFD as PNG.
Using JPEG
instead
Assuming you're okay with the image being compressed and not having info such as an alpha channel, you should be able to create an RTFD file with jpg
images.
For example, I was able to get an RTFD file down to 2.8 MB from over 12 MB (large image) just by replacing the generated PNG image with the original JPG one. This initially was unacceptible to TextEdit but I then changed the file extension of the image to .png
(even though it is still a JPG) and it accepted it.
In code it was even simpler. You may be able to get away with just changing how you add image attachments.
// Don't do this unless you want PNG
let image = NSImage(contentsOf: ...) // NSImage will write to a larger PNG file
let attachment = NSTextAttachment()
attachment.image = image
// Do this if you want smaller files
let image = try? Data(contentsOf: ...) // This will remain in raw JPG format
let attachment = NSTextAttachment(data: image, ofType: kUTTypeJPEG as String) // Explicitly specify JPG
Then when you create a new NSAttributedString
with that NSTextAttachment
and append it to NSTextStorage
, writing RTFD data will be signifantly smaller.
Of course, you may not have control of this process if you're relying on Cocoa UI/API for attaching images. That could make the process more difficult and you may need to resort to modifying the generated data by swapping images.
Approach 2: Custom Format
The approach described immediately above might be inconvenient due to not having control of the attachment-adding process and needing flat data. In that case a custom format might be better.
There's nothing stopping you from designing your own format (binary, text, package, whatever) and then writing a coder for it. You could specify a specific image format or support a variety. It's up to you. And unless you're a fancy word processor, you probably don't need to store all the attributes like font all the time.
I am also wondering whether there is a valid binary encoding option for Codable.
First, note that NSAttributedString
is an Objective-C class (when used on Apple platforms) and conforms to NSSecureCoding
instead of Codable
.
Note that you cannot extend NSAttributedString
to conform to Codable
, because the init(from:)
requirement on Decodable
can only be satisfied by guarenteeing that the initializer will be included on all subclasses as well. Since this class is non-final
, that means it can only be satisfied by a required init
. Required initializers can only be specified on the original declaration, not extensions.
For this reason, if you wanted to conform it to Codable
, you would need to use a wrapper object. enumerateAttributes(in:options:using:)
should be helpful for getting the attributes and raw characters that need encoded, but you'll need to be sure to pay attention to the images too.
As for encoding in binary, Codable
is completely agnostic to format, so you could write your own object conforming to Coder
that does whatever you want, including store everything using raw bytes.
Aside: Other Formats
Here's a quick rundown of other supported formats (in order of size). In these tests, I used the very small string "Hello World! There's so much to see!"
in the system font. After each format description (in parentheses) is the number of bytes to store that string.
Plain Text can store the above format in 36 bytes (1 for each character), but won't preserve attributes or attachments. (36 bytes)
RTF seems most lightweight if you need to preserve attributes but not attachments. (331 bytes)
HTML Is next lightest, but isn't really designed to be a storage format. In my experience, some attributes such as line spacing get lost when converted to HTML byNSAttributedString
. (536 bytes)
Binary Plist, which is made when you useNSKeyedArchiver
, is a fine option if you only need compatibility with Apple platforms and don't like the above formats. This format supports images too, but is generally still larger than the above (and RTFD). (648 bytes)
Web Archive is next for size, but I don't recommend using it as WebKit has deprecated it. Safari still uses it though for some things. (784 bytes)
Word ML is probably only useful for those that already know they need it. This format and all below it will generally have a bunch of boilerplate that will become a smaller percentage of the file as text is added. (~1.2 MB)
Open Document (OASIS) is smaller than most of the Word formats, but you probably wouldn't use it without a good reason. (~2.4 MB)
Office Open XML Is another format you'd only use if you needed that format exactly. (~3.5 MB)
Doc (Microsoft Word) This file is very large in comparison for small amounts of text. While I would expect this format to allow images, in my testing the file size did not actually go up when I added one. (~19.4 MB)
Mac Simple Text seems to always generate an error. (N/A)
Final Note
In the end, the encoding experience for NSAttributedString should get better as Foundation continues to adapt to Swift rather than Objective-C. You can imagine a day where NSAttributedString or some similar Swifty type conforms to Codable out of the box and can then be paired with any file format's encoder and decoder.
Thank you very much for your detailed answer! Do you also know a good/the best way to save NSAttributedStrings which do not contain images with the least amount of memory usage?
– Qbyte
Feb 12 at 22:28
@Qbyte Yep, I added an aside to my answer with other formats.
– Matthew Seaman
Feb 12 at 23:23
Thanks for the detailed answer. This is very helpful to explore a solution that will work for my case.
– jmdecombe
Mar 29 at 15:52
TL;DR: RTFD encodes images as PNGs, but you can make it encode JPGs instead to save space. A custom format might be better and easier though if you have the time to create one.
NSAttributedString can encode to HTML, RTF, RTFD, plain text, a variety of Office/Word formats, etc. Given that each of these is an official format with an official spec that must be followed, there's not much that can be done in terms of saving space other than:
- Choosing the supported format that works best for your use cases and has the smallest footprint.
OR
- Writing your own format.
Approach 1: RTFD
Of the supported formats, RTFD does indeed sound best for your use case because it includes support for attachments such as images. Feel free to try out the other included formats, which are described below in "Other Formats".
Saving it as Data, for example using the rtfd(from:documentAttributes:) method, and as part of a Codable structure, results in a very large string, much larger than the content itself especially when inserting an image into the NSTextView. For example, inserting a 200K image will result in a 5MB JSON file.
To understand what is happening here, try out the following code:
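// rtfdFileWrapper(from:documentAttributes:) returns an optional FileWrapper and does not throw;
// write(to:options:originalContentsURL:) is the call that can throw.
do {
    let range = NSRange(location: 0, length: someAttributedString.length)
    let rtfd = someAttributedString.rtfdFileWrapper(from: range, documentAttributes: [:])
    try rtfd?.write(to: URL(fileURLWithPath: "/Users/yourname/someFolder/RTFD.rtfd"), options: .atomic, originalContentsURL: nil)
} catch {
    print("\(error)")
}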
When you call rtfd(from:documentAttributes:), you're getting flat Data. This flat data can then be encoded somewhere and read back into NSAttributedString. But make no mistake: RTFD is a package format ("D" stands for directory). So by instead calling rtfdFileWrapper(from:documentAttributes:) and writing that to a URL with the rtfd extension, we can see the actual package format that rtfd(from:documentAttributes:) replicates, but as a directory instead of raw data. In Finder, right-click the generated file and choose "Show Package Contents".
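If you would rather check the package from code than from Finder, a minimal sketch along these lines lists each file in the package with its size, which makes an oversized image easy to spot. The function name fileSizes(inPackageAt:) is made up for this illustration, and it only looks at the top level of the directory.

```swift
import Foundation

/// Lists the files at the top level of a package directory (for example an
/// .rtfd package written to disk) together with their sizes in bytes.
/// Hypothetical helper for illustration; not part of any Apple API.
func fileSizes(inPackageAt packageURL: URL) throws -> [(name: String, bytes: Int)] {
    let fileManager = FileManager.default
    let names = try fileManager.contentsOfDirectory(atPath: packageURL.path)
    var result: [(name: String, bytes: Int)] = []
    for name in names.sorted() {
        let path = packageURL.appendingPathComponent(name).path
        let attributes = try fileManager.attributesOfItem(atPath: path)
        // The .size attribute is reported as a number of bytes.
        let bytes = (attributes[.size] as? NSNumber)?.intValue ?? 0
        result.append((name: name, bytes: bytes))
    }
    return result
}
```

Calling it with the URL used in the snippet above and printing each entry should show the RTF file next to the image files, so you can compare the image sizes with the originals.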
The RTFD package contains an RTF file to specify the text and attributes, and a copy of each attachment. So why was your example so much bigger? In my tests, the answer seems to be that RTFD expects to find its images in PNG format. When calling rtfdFileWrapper(from:documentAttributes:) or rtfd(from:documentAttributes:), any image attachments seem to get written out as PNG files, which take up significantly more space. This happens because your image gets wrapped in an NSImage before getting wrapped in an NSTextAttachment. The NSImage is able to write the image data out in other formats, including larger formats like PNG.
I'm assuming the image you tried was in a compressed format like JPEG, and NSAttributedString wrote it to RTFD as PNG.
Using JPEG instead
Assuming you're okay with the image being compressed and not having info such as an alpha channel, you should be able to create an RTFD file with jpg images.
For example, I was able to get an RTFD file down to 2.8 MB from over 12 MB (large image) just by replacing the generated PNG image with the original JPG one. This initially was unacceptable to TextEdit, but I then changed the file extension of the image to .png (even though it is still a JPG) and it accepted it.
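Since the file extension may no longer match the real image format after such a swap, it can help to look at the first bytes of the file instead. The sketch below is an illustration only (the names ImageFormat and sniffImageFormat are invented here); it relies on the standard PNG signature and the usual JPEG start-of-image marker.

```swift
import Foundation

enum ImageFormat: Equatable {
    case png, jpeg, unknown
}

/// Guesses the image format from the leading bytes of the data,
/// ignoring whatever the file extension claims.
func sniffImageFormat(_ data: Data) -> ImageFormat {
    // PNG files begin with this fixed 8-byte signature.
    let pngSignature: [UInt8] = [0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A]
    // JPEG files begin with the start-of-image marker followed by 0xFF.
    let jpegPrefix: [UInt8] = [0xFF, 0xD8, 0xFF]

    if data.starts(with: pngSignature) { return .png }
    if data.starts(with: jpegPrefix) { return .jpeg }
    return .unknown
}
```

Running this over the image files inside the generated package is a quick way to confirm whether an attachment really ended up as PNG or stayed JPEG.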
In code it was even simpler. You may be able to get away with just changing how you add image attachments.
// Don't do this unless you want PNG
let image = NSImage(contentsOf: ...) // NSImage will write to a larger PNG file
let pngAttachment = NSTextAttachment()
pngAttachment.image = image
// Do this if you want smaller files
let jpegData = try? Data(contentsOf: ...) // This will remain in raw JPG format
let jpegAttachment = NSTextAttachment(data: jpegData, ofType: kUTTypeJPEG as String) // Explicitly specify JPG
Then when you create a new NSAttributedString with that NSTextAttachment and append it to NSTextStorage, the written RTFD data will be significantly smaller.
Of course, you may not have control of this process if you're relying on Cocoa UI/API for attaching images. That could make the process more difficult and you may need to resort to modifying the generated data by swapping images.
Approach 2: Custom Format
The approach described immediately above might be inconvenient due to not having control of the attachment-adding process and needing flat data. In that case a custom format might be better.
There's nothing stopping you from designing your own format (binary, text, package, whatever) and then writing a coder for it. You could specify a specific image format or support a variety. It's up to you. And unless you're a fancy word processor, you probably don't need to store all the attributes like font all the time.
I am also wondering whether there is a valid binary encoding option for Codable.
First, note that NSAttributedString is an Objective-C class (when used on Apple platforms) and conforms to NSSecureCoding instead of Codable.
Note that you cannot extend NSAttributedString to conform to Codable, because the init(from:) requirement on Decodable can only be satisfied by guaranteeing that the initializer will be included on all subclasses as well. Since this class is non-final, that means it can only be satisfied by a required init. Required initializers can only be specified on the original declaration, not extensions.
For this reason, if you wanted to make it Codable, you would need to use a wrapper object. enumerateAttributes(in:options:using:) should be helpful for getting the attributes and raw characters that need to be encoded, but you'll need to be sure to pay attention to the images too.
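To make the wrapper idea a bit more concrete, here is a rough sketch of what it could look like. The types TextRun and AttributedStringBox are invented for this example, and to keep it short it only preserves attributes whose values are plain strings; real attributes such as fonts, colors, paragraph styles and image attachments would each need their own encoding.

```swift
import Foundation

/// One contiguous run of text whose characters share the same attributes.
struct TextRun: Codable, Equatable {
    var text: String
    var attributes: [String: String]  // sketch: only string-valued attributes are kept
}

/// Hypothetical Codable wrapper around the content of an NSAttributedString.
struct AttributedStringBox: Codable, Equatable {
    var runs: [TextRun]

    init(_ attributed: NSAttributedString) {
        var collected: [TextRun] = []
        let whole = NSRange(location: 0, length: attributed.length)
        attributed.enumerateAttributes(in: whole, options: []) { attrs, range, _ in
            var stored: [String: String] = [:]
            for (key, value) in attrs {
                // Anything that is not a String is silently dropped in this sketch.
                if let string = value as? String {
                    stored[key.rawValue] = string
                }
            }
            let text = attributed.attributedSubstring(from: range).string
            collected.append(TextRun(text: text, attributes: stored))
        }
        runs = collected
    }

    /// Rebuilds an attributed string from the stored runs.
    var attributedString: NSAttributedString {
        let result = NSMutableAttributedString()
        for run in runs {
            var attrs: [NSAttributedString.Key: Any] = [:]
            for (name, value) in run.attributes {
                attrs[NSAttributedString.Key(rawValue: name)] = value
            }
            result.append(NSAttributedString(string: run.text, attributes: attrs))
        }
        return result
    }
}
```

Because the wrapper is a plain Codable struct, it can be handed to JSONEncoder, PropertyListEncoder, or any other encoder without touching NSAttributedString itself.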
As for encoding in binary, Codable is completely agnostic to format, so you could write your own types conforming to Encoder and Decoder that do whatever you want, including storing everything using raw bytes.
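Before writing an encoder from scratch, it may be worth trying the binary property list encoder that already ships with Foundation. The sketch below uses a made-up StoredDocument type to compare it with JSON: JSONEncoder writes Data as a Base64 string by default, which adds roughly a third to its size, while a binary property list keeps the bytes as they are. This does nothing about the PNG conversion described earlier; it only avoids the extra text-encoding overhead.

```swift
import Foundation

// Hypothetical model: a document whose rich text is kept as flat RTFD data.
struct StoredDocument: Codable, Equatable {
    var title: String
    var rtfd: Data
}

/// Encodes the same value as JSON and as a binary property list
/// so the two sizes can be compared.
func encodeBothWays(_ document: StoredDocument) throws -> (json: Data, binaryPlist: Data) {
    // JSON: Data properties become Base64 text by default.
    let json = try JSONEncoder().encode(document)

    // Binary property list: Data properties are stored as raw bytes.
    let plistEncoder = PropertyListEncoder()
    plistEncoder.outputFormat = .binary
    let binaryPlist = try plistEncoder.encode(document)

    return (json, binaryPlist)
}
```

Reading the value back is symmetrical: PropertyListDecoder().decode(StoredDocument.self, from: data) returns the original struct.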
edited Feb 12 at 23:29
answered Feb 12 at 21:33
Matthew Seaman